ADAPTIVE CONFIDENCE BALLS

Adaptive confidence balls are constructed for individual resolution
levels as well as the entire mean vector in a multiresolution
framework. Finite sample lower bounds are given for the minimum
expected squared radius for confidence balls with a prespecified
confidence level. The confidence balls are centered on adaptive estimators
based on special local block thresholding rules. The radius is derived
from an analysis of the loss of this adaptive estimator. In addition
adaptive honest confidence balls are constructed which have guaranteed
coverage probability over all of $\mathbb{R}^N$ and expected squared radius
adapting over a maximum range of Besov bodies.

1. Introduction. A central goal in nonparametric function estimation, 
and one which has been the focus of much attention in the statistics lit- 
erature, is the construction of adaptive estimators. Informally, an adaptive 
procedure automatically adjusts to the smoothness properties of the under- 
lying function. A common way to evaluate such a procedure is to compute 
its maximum risk over a collection of parameter spaces and to compare these 
values to the minimax risk over each of them. 

It should be stressed that such adaptive estimators do not provide a 
data-dependent estimate of the loss, nor do they immediately yield easily 
constructed adaptive confidence sets. Such confidence sets should have size 
which adapts to the smoothness of the underlying function while maintaining 
a prespecified coverage probability over a given function space. Moreover, it 
is clearly desirable to center such confidence sets on estimators which pos- 
sess other strong optimality properties. In the present paper, a confidence 
ball is constructed centered on a special block thresholding rule which has 
particularly good spatial adaptivity. The radius is built upon good estimates 
of loss. 
We focus on a sequence of statistical models commonly used in the adaptive
estimation literature, namely, a multivariate normal model with mean
vector corresponding to wavelet coefficients. More specifically, consider the
models

(1)  $y_{j,k} = \theta_{j,k} + \frac{1}{\sqrt{n}} z_{j,k}, \quad j = 0, 1, \ldots, J-1,\ k = 1, \ldots, 2^j,$

where $z_{j,k} \overset{\text{i.i.d.}}{\sim} N(0, 1)$ and where it is assumed that $N$ is a function of $n$,
$2^J - 1 = N$ and that the mean vector $\theta$ lies in a parameter space $\Theta$. In
the present work, confidence balls are constructed over collections of Besov
bodies

(2)  $B^\beta_{p,q}(M) = \Big\{\theta : \Big(\sum_{j=0}^{J-1} \Big(2^{js} \Big(\sum_{k=1}^{2^j} |\theta_{j,k}|^p\Big)^{1/p}\Big)^q\Big)^{1/q} \le M\Big\},$

where $s = \beta + \frac12 - \frac1p > 0$ and $p \ge 2$. In particular, these spaces contain as
special cases a number of traditional smoothness classes such as Sobolev and
Hölder spaces. Although not needed for the development given in this paper,
it may be helpful to think of the $\theta_{j,k}$ as wavelet coefficients of a regression
function $f$. A confidence ball for the vector $\theta$ then yields a corresponding
confidence ball for the regression function $f$. See, for example, [8], where such
an approach is taken. Based on the model (1), we introduce new estimates of
the loss of block thresholding estimators and use these estimates to construct
confidence balls.
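As a small illustration of the model (1) and the constraint (2), the following sketch simulates observations and evaluates the Besov sequence norm. The function names, the nested-list layout (level $j$ holding $2^j$ coefficients) and the example mean vector are assumptions made for illustration only; they are not taken from the paper.

```python
import math
import random

def simulate_model(theta, n, rng):
    """Draw y[j][k] = theta[j][k] + z / sqrt(n) with z ~ N(0, 1), as in model (1)."""
    sd = 1.0 / math.sqrt(n)
    return [[t + sd * rng.gauss(0.0, 1.0) for t in level] for level in theta]

def besov_norm(theta, beta, p, q):
    """Left-hand side of the constraint in (2); theta[j] holds the 2**j means of level j."""
    s = beta + 0.5 - 1.0 / p
    total = 0.0
    for j, level in enumerate(theta):
        level_norm = sum(abs(t) ** p for t in level) ** (1.0 / p)
        total += (2.0 ** (j * s) * level_norm) ** q
    return total ** (1.0 / q)

# Hypothetical example: J = 4 levels, coefficients decaying like 2^{-j(beta + 1/2)}.
J, beta = 4, 1.0
theta = [[2.0 ** (-j * (beta + 0.5))] * (2 ** j) for j in range(J)]
y = simulate_model(theta, n=1024, rng=random.Random(0))
norm = besov_norm(theta, beta=beta, p=2.0, q=2.0)
```

With $p = q = 2$ each level of this example contributes exactly 1 to the inner sum, so the norm equals $\sqrt{J}$ and the vector lies in $B^\beta_{2,2}(M)$ for any $M \ge 2$.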

In the context of confidence balls, adaptation over a general collection
of parameter spaces $\mathcal{C} = \{\Theta_i : i \in I\}$, where $I$ is an index set, can be made
precise as follows. An adaptive confidence ball guarantees a given coverage
probability over the union of these spaces while simultaneously minimizing
the maximum expected squared radius over each of the parameter spaces.
Write $\mathcal{B}_{\alpha,\Theta}$ for the collection of all confidence balls which have coverage
probability of at least $1 - \alpha$ over $\Theta$. Write $r^2(CB, \Theta)$ for the maximum
expected squared radius of a confidence ball $CB$ over $\Theta$ and $r^2_\alpha(\Theta)$ for the
minimax expected squared radius over confidence balls in $\mathcal{B}_{\alpha,\Theta}$. Then $r^2_\alpha(\Theta)$
is the smallest maximum expected squared radius of confidence balls with
guaranteed coverage over $\Theta$. Adaptation over the collection $\mathcal{C}$ can then be
defined as follows. Let $\Theta_I = \bigcup_{i \in I} \Theta_i$. A confidence ball $CB \in \mathcal{B}_{\alpha,\Theta_I}$ is called
adaptive over $\mathcal{C}$ if for all $i \in I$, $r^2(CB, \Theta_i) \le C_i r^2_\alpha(\Theta_i)$, where $C_i$ are constants
not depending on $n$, and we say that adaptation is possible over $\mathcal{C}$ if such a
procedure exists.

In a multivariate normal setup as given in the model (1) with $N = n$, Li
[11] constructs adaptive confidence balls for the mean vector which have a
given coverage over all of $\mathbb{R}^N$. It was shown that under this constraint the
squared radius of the ball must, with high probability, be bounded from
below by $cn^{-1/4}$ for all choices of the unknown mean vector. Moreover a
confidence ball was constructed centered on a shrinkage estimator which
attains this lower bound at least for some subsets of $\mathbb{R}^N$.

Hoffmann and Lepski [9] introduce the concept of a random normalizing
factor into the study of nonparametric function estimation and use this
idea to construct asymptotic confidence balls which adapt over a collection
of finitely many parameter spaces. In particular, their results can be used
to yield asymptotic confidence balls which adapt over a finite number of
Sobolev bodies. Baraud [1] is a further development of both Li [11] and
Hoffmann and Lepski [9] concentrating on confidence balls which perform
well over a finite family of linear subspaces. An honest confidence ball over
$\mathbb{R}^N$ was constructed such that the radius adapts with high probability to a
given collection of subspaces.

Juditsky and Lambert-Lacroix [10] develop adaptive $L_2$ confidence balls
for a function $f$ in a nonparametric regression setup with equally spaced
design. The paper used unbiased estimates of risk to construct minimax rate
adaptive procedures over Besov spaces. It focused on the asymptotic
performance and detailed finite sample results were not given. Robins and van der
Vaart [12] use sample splitting to divide the construction of the center and 
radius of a confidence ball into independent problems and show how to use 
estimates of quadratic functionals to construct adaptive confidence balls. 

In the present paper the focus is on finite sample properties of adaptive 
confidence balls centered on a special local block thresholding estimator 
known to have strong adaptivity under mean integrated squared error. The 
radius is derived from an analysis of the loss of this adaptive estimator. The 
evaluation of the performance of the resulting confidence ball relies on a 
detailed understanding of the interplay between these two estimates. Three 
cases of interest are considered in detail. We first construct confidence balls 
for the mean vector at individual resolution levels. Then adaptive confidence 
balls are constructed for all $N$ coefficients over Besov bodies. Finally we
consider honest confidence balls over all of $\mathbb{R}^N$ with expected squared radius
adapting over a maximum range of Besov bodies.

The paper is organized as follows. Section 2 is focused on constructing 
confidence balls for the mean vector associated with a single resolution level 
j in the Gaussian model (1). These confidence balls can be used in a mul- 
tiresolution study. Finite sample lower bounds are given for the expected 
squared radius of confidence balls which have a prescribed minimum cov- 
erage level over a given Besov body. Bounds are given for the maximum 
expected squared radius as well as when the mean vector is equal to zero. 
Confidence balls which have an expected squared radius within a constant 
factor of both these lower bounds are constructed. We show that the prob- 
lem is degenerate over a certain range of Besov bodies beyond which full 
adaptation is possible. Adaptive confidence balls are constructed centered 
on a block thresholding estimator. The results and ideas given in this
section are used as building blocks in the analysis and construction of adaptive
confidence balls for all $N$ coefficients in Sections 3 and 4.

The focus of Section 3 is on the construction and analysis of confidence
balls with a specified minimal coverage probability over a given Besov body
$B^\beta_{p,q}(M)$. It is shown that the possible range of adaptation depends on the
relationship between the dimension $N$ and the noise level. Adaptive confidence
balls are constructed over a maximal range of Besov bodies. These results 
are markedly different from the bounds derived for adaptive estimation or 
adaptive confidence intervals. 

In Section 4 confidence balls are constructed which have guaranteed
coverage probability over all of $\mathbb{R}^N$. This procedure has a number of strong
optimality properties. It adapts over a maximal range of Besov bodies over
which honest confidence balls can adapt. Moreover, given that the
confidence ball has a prespecified coverage probability over $\mathbb{R}^N$, it has maximum
expected squared radius within a constant factor of the smallest maximum
expected squared radius for all Besov bodies $B^\beta_{p,q}(M)$ with $\beta > 0$ and $M \ge 1$.

Proofs are given in Section 5. 

2. Adaptive confidence balls for a single resolution level. As mentioned 
in the Introduction, the mean $\theta_{j,k}$ in the model (1) can be thought of as the
$k$th coefficient at level $j$ in a wavelet expansion of a function $f$. The different
levels j allow for a multiresolution analysis where the coefficients with small 
values of j correspond to coarse features and where the coefficients with 
large values of j correspond to fine features. In this section we first fix a 
level j and focus not only on estimating the sequence of means at that level 
but also on constructing honest confidence balls for this set of coefficients. 

Confidence balls are constructed which maintain coverage no matter the
values of $\theta_{j,k}$ and have an expected radius adapting to these coefficients over
a range of Besov bodies. The analysis given in this section also provides 
insight (as is shown in Sections 3 and 4) into the problem of estimating all 
the wavelet coefficients across different levels. 

In the following analysis, for a given level $j$, write $\theta_j$ for the sequence of
mean values at this given resolution level. That is, $\theta_j = \{\theta_{j,k} : k = 1, \ldots, 2^j\}$.
The analysis can then naturally be divided into two parts. We start with
lower bounds for the expected squared radius of confidence balls which have
a given coverage probability over a given Besov body. Two lower bounds
are given. One is for the expected squared radius when all the coefficients
are zero. The other is for the maximum expected squared radius. Set $z_\alpha =
\Phi^{-1}(1 - \alpha)$, where $\Phi$ is the cumulative distribution function of a standard
Normal random variable.
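Theorem 1. Fix $0 < \alpha < \frac12$ and let $CB(\delta, r_\alpha) = \{\theta_j : \|\theta_j - \delta\|_2 \le r_\alpha\}$ be a
confidence ball for $\theta_j$ with random radius $r_\alpha$ which has a guaranteed coverage
probability over $B^\beta_{p,q}(M)$ of at least $1 - \alpha$. Then for any $0 < \varepsilon < \frac12(\frac12 - \alpha)$,

(3)  $\sup_{\theta \in B^\beta_{p,q}(M)} E_\theta(r_\alpha^2) \ge \frac{\varepsilon^2}{1 - \alpha - \varepsilon} \min(M^2 2^{-2\beta j},\ z^2_{\alpha+2\varepsilon} 2^j n^{-1}).$

Moreover, for any $0 < \varepsilon < \frac12 - \alpha$,

(4)  $E_0(r_\alpha^2) \ge \frac14 (1 - 2\alpha - 2\varepsilon) \min(M^2 2^{-2\beta j},\ \log^{1/2}(1 + \varepsilon^2) 2^{j/2} n^{-1}),$

where $E_0$ denotes expectation under $\theta = 0$.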
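To get a feel for the size of the two lower bounds in Theorem 1, the sketch below evaluates their right-hand sides numerically. It assumes the constants read as $\varepsilon^2/(1 - \alpha - \varepsilon)$ in (3) and $\frac14(1 - 2\alpha - 2\varepsilon)$ in (4), which is what the proof in Section 5.1 yields; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def z_quantile(alpha):
    """z_alpha = Phi^{-1}(1 - alpha) for the standard Normal distribution."""
    return NormalDist().inv_cdf(1.0 - alpha)

def max_radius_lower_bound(alpha, eps, M, beta, j, n):
    """Right-hand side of (3), assuming the constant eps^2 / (1 - alpha - eps)."""
    if not 0.0 < eps < 0.5 * (0.5 - alpha):
        raise ValueError("need 0 < eps < (1/2)(1/2 - alpha)")
    z = z_quantile(alpha + 2.0 * eps)
    const = eps ** 2 / (1.0 - alpha - eps)
    return const * min(M ** 2 * 2.0 ** (-2.0 * beta * j), z ** 2 * 2.0 ** j / n)

def zero_radius_lower_bound(alpha, eps, M, beta, j, n):
    """Right-hand side of (4), assuming the leading constant 1/4."""
    if not 0.0 < eps < 0.5 - alpha:
        raise ValueError("need 0 < eps < 1/2 - alpha")
    gamma = math.log(1.0 + eps ** 2)
    const = 0.25 * (1.0 - 2.0 * alpha - 2.0 * eps)
    return const * min(M ** 2 * 2.0 ** (-2.0 * beta * j),
                       math.sqrt(gamma) * 2.0 ** (j / 2.0) / n)
```

For a very fine level, for example $j = 20$ with $n = 100$, both minima are attained by $M^2 2^{-2\beta j}$, which is the degenerate regime discussed after the theorem.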



It is useful to note that the maximum value of $\sum_k \theta^2_{j,k}$ at a given level $j$
over the Besov body $B^\beta_{p,q}(M)$ is $M^2 2^{-2\beta j}$. Hence, from (4), if $M^2 2^{-2\beta j} \le
\log^{1/2}(1 + \varepsilon^2) 2^{j/2} n^{-1}$ the lower bound for the expected squared radius when
the mean vector is equal to zero is a constant multiple of $M^2 2^{-2\beta j}$. It follows
that if a given coverage probability is guaranteed over $B^\beta_{p,q}(M)$ then the
maximum expected squared radius over any other Besov body must also be
of this same order. It should be stressed that this is really a degenerate case
since the trivial ball centered at zero with squared radius equal to $M^2 2^{-2\beta j}$
is within a constant factor of the lower bounds given in (3) and (4) and has
coverage probability equal to one. Thus we shall focus only on the
construction of confidence balls which have a given coverage probability at least over
Besov bodies where $M^2 2^{-2\beta j} \ge \log^{1/2}(1 + \varepsilon^2) 2^{j/2} n^{-1}$. In particular, we only
need to consider resolution levels $j = j_n$ satisfying $2^j \le n^2$ since resolution
levels with $2^j > n^2$ satisfy $M^2 2^{-2\beta j} < \log^{1/2}(1 + \varepsilon^2) 2^{j/2} n^{-1}$ at least for large
$n$. Moreover, since little is to be gained for levels where $2^j < \log n$ by using
confidence balls with random radius, in such cases we shall just use the usual
$100(1 - \alpha)\%$ confidence ball centered on the observations $y_{j,k}$. Thus in the
following construction attention is focused on cases where $\log n \le 2^j \le n^2$.

As mentioned in the Introduction, the center of the ball is constructed by
local thresholding. Set $L = \log n$ and let $B^i_j = \{(j, k) : (i - 1)L + 1 \le k \le iL\}$,
$1 \le i \le 2^j/L$, denote the set of indices of the coefficients in the $i$th block at
level $j$. For a given block $B^i_j$, set

(5)  $S^2_{j,i} = \sum_{(j,k) \in B^i_j} y^2_{j,k}, \quad \xi^2_{j,i} = \sum_{(j,k) \in B^i_j} \theta^2_{j,k} \quad \text{and} \quad \chi^2_{j,i} = \sum_{(j,k) \in B^i_j} z^2_{j,k}.$

Let $\lambda_* = 6.9368$ be the root of the equation $\lambda - \log \lambda = 5$. This threshold is
similar to the one used in [4, 5]. Then the center $\hat\theta_j = (\hat\theta_{j,k})$ is defined by

(6)  $\hat\theta_{j,k} = y_{j,k} \cdot I(S^2_{j,i} > \lambda_* L n^{-1}).$
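The threshold constant can be checked numerically. The sketch below solves $\lambda - \log\lambda = 5$ by bisection; the bracket $[1, 20]$ and the function name are choices made here for illustration. The equation also has a second root below 1, so the bracket is what selects the value quoted in the text.

```python
import math

def solve_lambda_star(target=5.0, lo=1.0, hi=20.0, tol=1e-10):
    """Bisection for the root of lambda - log(lambda) = target on [lo, hi].

    The map lambda -> lambda - log(lambda) is increasing for lambda > 1,
    so the root inside this bracket is unique.
    """
    def f(lam):
        return lam - math.log(lam) - target

    if not (f(lo) < 0.0 < f(hi)):
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lambda_star = solve_lambda_star()
```

The computed value agrees with 6.9368 to the four decimals given in the text.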
It follows from [5] that this local block thresholding rule has strong
adaptivity under both global and local risk measures. We now show how the loss
$\|\hat\theta_j - \theta_j\|_2^2$ of this estimator can be estimated and used in the construction
of the radius of the confidence ball. Note that $\hat\theta_{j,k}$ equals either $0$ or $y_{j,k}$ and
hence the loss can be broken into two terms,

(7)  $\|\hat\theta_j - \theta_j\|_2^2 = \sum_k (\hat\theta_{j,k} - \theta_{j,k})^2 = \sum_i \xi^2_{j,i} I(S^2_{j,i} \le \lambda_* L n^{-1}) + n^{-1} \sum_i \chi^2_{j,i} I(S^2_{j,i} > \lambda_* L n^{-1}).$

The first term can be handled by using an estimate of the quadratic functional
$\xi^2_{j,i}$. The other term can be analyzed using the fact that $\chi^2_{j,i}$ has a central
chi-squared distribution.
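The block thresholding rule and the two-term split of the loss can be sketched for a single level as follows. This is a minimal illustration under stated assumptions: the block length is taken to be 8 (the text sets $L = \log n$, which is about 6.93 for $n = 1024$; rounding up so that blocks tile the level is a simplification made here), and the function names and the example mean vector are hypothetical.

```python
import math
import random

LAMBDA_STAR = 6.9368  # threshold constant quoted in the text

def block_threshold_level(y, n, block_len):
    """Keep a block when its sum of squares exceeds LAMBDA_STAR * L / n, else zero it.

    Returns the estimate for the level and the per-block keep flags.
    """
    cutoff = LAMBDA_STAR * block_len / n
    est, kept = [], []
    for start in range(0, len(y), block_len):
        block = y[start:start + block_len]
        keep = sum(v * v for v in block) > cutoff
        kept.append(keep)
        est.extend(block if keep else [0.0] * len(block))
    return est, kept

def loss_decomposition(theta, y, n, block_len):
    """Split the squared loss into killed-block and kept-block contributions."""
    est, kept = block_threshold_level(y, n, block_len)
    killed_term = kept_term = 0.0
    for i, keep in enumerate(kept):
        sl = slice(i * block_len, (i + 1) * block_len)
        if keep:
            # On a kept block the error is pure noise: (y - theta)^2 = z^2 / n.
            kept_term += sum((yv - tv) ** 2 for yv, tv in zip(y[sl], theta[sl]))
        else:
            # On a killed block the error is the squared mean itself.
            killed_term += sum(tv * tv for tv in theta[sl])
    loss = sum((e - t) ** 2 for e, t in zip(est, theta))
    return loss, killed_term, kept_term

rng = random.Random(1)
n, block_len = 1024, 8  # assumption: L = log n rounded up to 8
theta = [0.5] * 8 + [0.0] * 56  # one level with 2^6 coefficients, one active block
y = [t + rng.gauss(0.0, 1.0) / math.sqrt(n) for t in theta]
loss, killed_term, kept_term = loss_decomposition(theta, y, n, block_len)
```

In this example only the first block carries signal, so the rule keeps it and sets the remaining seven blocks to zero; the loss then consists of the noise energy on the kept block alone.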

Let $(x)_+$ denote $\max(0, x)$ and set



rl 



(8) 



2 log 1 / 2 [- +4Ay 2 V 2 

-( \ 

+ - )I(Sji < A.Ln" 1 ) J 

+ (2A* + 8\l /2 - l)Ln- 1 Card{i : S] A > K.Ln~ 1 }- 
The confidence ball is then defined as

(9)  $CB_*(\hat\theta_j, r_\alpha) = \{\theta_j : \|\theta_j - \hat\theta_j\|_2 \le r_\alpha\},$

where, when $2^j \ge \log n$, the center $\hat\theta_j$ is given as in (6) and the radius is given
in (8), and where $\hat\theta_{j,k} = y_{j,k}$ and $r_\alpha$ is the radius of the usual $100(1 - \alpha)\%$
confidence ball when $2^j < \log n$.

Theorem 2. Let the confidence ball $CB_*(\hat\theta_j, r_\alpha)$ be given as in (9) and
suppose that the resolution level $j$ satisfies $2^j \le n^2$. Then

(10)  $\inf_{\theta \in \mathbb{R}^N} P(\theta_j \in CB_*(\hat\theta_j, r_\alpha)) \ge 1 - \alpha - 2(\log n)^{-1},$

and for a constant $C_\beta$ depending only on $\beta$,



sup E(r%) < 
( n ) 8eB^ q (M) 



21og 1 /^fj+4Ai /2 V2 + 4 
+ Cp min(2 J n -1 , M 2 2~ 2(3j ) . 



yi 2 n- 1 



Note that the confidence ball constructed above attains the minimax
lower bound given in (3) simultaneously over all Besov bodies $B^\beta_{p,q}(M)$ with
$M^2 2^{-2\beta j} \ge \log^{1/2}(1 + \varepsilon^2) 2^{j/2} n^{-1}$. This is true even though the confidence
ball has a given level of coverage for all $\theta$ in $\mathbb{R}^N$.
3. Adaptive confidence balls over Besov bodies. The confidence balls
constructed in Section 2 focused on a given resolution level. In this section
this construction is extended to the more complicated case of estimating
all $N$ coefficients of $\theta$. Specifically, we consider adaptation over a collection
of Besov bodies $B^\beta_{p,q}(M)$ with $p \ge 2$. It should be stressed that the theory
developed in this section for adaptive confidence balls is quite different from 
that of adaptive estimation theory where adaptation under global losses is 
possible over all Besov bodies. In particular, adaptation for confidence balls 
is only possible over a much smaller range of Besov bodies. 

In Section 3.1 a lower bound is given on both the maximum and the 
minimum expected squared radius for any confidence ball with a particular 
coverage probability over a Besov body. As in Section 2, these lower bounds 
provide a fundamental limit to the range of Besov bodies where adaptation is 
possible. Adaptive confidence balls are described in Section 3.2. They build 
on the construction given in Section 2. The center uses the special local 
block thresholding rule used in Section 2 up to a particular level and then 
estimates the remaining coordinates by zero. The radius is chosen based 
on an estimate of the loss of this block thresholding estimate. The analysis 
of the resulting confidence ball relies on a detailed understanding of the 
interplay between these two estimates. 

3.1. Lower bounds. Theorem 1 provides lower bounds for the expected
squared radius of a confidence ball for the mean vector at a given resolution
level with a given coverage over $B^\beta_{p,q}(M)$. In this section lower bounds are
given for the expected squared radius for the whole mean vector for any
confidence ball which has a given coverage probability over $B^\beta_{p,q}(M)$. There
are two lower bounds, one for the maximum expected squared radius and 
one for the minimum expected squared radius. We shall show that these two 
lower bounds determine the range over which adaptation is possible. 

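Theorem 3. Fix $0 < \alpha < \frac12$ and let $CB(\delta, r_\alpha) = \{\theta : \|\theta - \delta\|_2 \le r_\alpha\}$ be a
$1 - \alpha$ level confidence ball for $\theta \in B^\beta_{p,q}(M)$ with random radius $r_\alpha$. Then
for any $0 < \varepsilon < \frac12(\frac12 - \alpha)$,

(12)  $\sup_{\theta \in B^\beta_{p,q}(M)} E_\theta(r_\alpha^2) \ge \frac{\varepsilon^2}{1 - \alpha - \varepsilon} \cdot \frac12 z^2_{\alpha+2\varepsilon} \min\big(N n^{-1},\ (1 - 2^{-q(\beta+1/2)})^{2/(q(1+2\beta))} z_{\alpha+2\varepsilon}^{-2/(1+2\beta)} M^{2/(1+2\beta)} n^{-2\beta/(1+2\beta)}\big).$

For any $0 < \varepsilon < \frac12 - \alpha$, set $\gamma = \log(1 + \varepsilon^2)$. For $0 < M' < M$ set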

b e = mm(2^ 2 ( 1+4 ^- V /(1+4/?) (M - M / ) 1/(1+4/3) n- 2 ^ 1 + 4 ' 3 ), 
[6) i 7 1 /4 JV l/4 n -l/2 ) _ 

Then for all $\theta \in B^\beta_{p,q}(M')$,

(14)  $P_\theta(r_\alpha > b_\varepsilon) \ge 1 - 2\alpha - 2\varepsilon$

and consequently

(15)  $\inf_{\theta \in B^\beta_{p,q}(M')} E_\theta(r_\alpha^2) \ge (1 - 2\alpha - 2\varepsilon) b_\varepsilon^2.$

In fact, as is shown in the next section, both bounds are rate sharp in
the sense that there are confidence balls with a given coverage probability
over $B^\beta_{p,q}(M)$ which have expected squared radius within a constant factor
of the lower bounds given in (12) and (15). There are two cases of interest,
namely, when $N \ge n^2$ and $N < n^2$. First suppose that $N \ge n^2$ and fix a
Besov body $B^\beta_{p,q}(M)$ over which it is assumed that the confidence ball has
a given coverage probability. Then by (15) the minimum expected squared
radius is at least of order $n^{-4\beta/(1+4\beta)}$. Since from (12) the minimax expected
squared radius for confidence balls over $B^\tau_{p,q}(M)$ is of order $n^{-2\tau/(1+2\tau)}$, the
confidence ball $CB(\delta, r)$ must have expected squared radius larger than the
minimax expected squared radius over any Besov body $B^\tau_{p',q'}(M)$ whenever
$\tau > 2\beta$ and $p' \ge 2$. Hence in this case it is impossible to adapt over any
Besov body with smoothness index $\tau > 2\beta$. Consequently in this case there
is a maximum range of Besov bodies over which full adaptation is possible.

Now suppose that $N < n^2$ and that $N \asymp n^\rho$ where $0 < \rho < 2$. In this case
the possible range of adaptation depends on the value of $\rho$. Let $CB(\delta, r)$
be a confidence ball with guaranteed coverage probability over $B^\beta_{p,q}(M)$.
First suppose that $\beta \ge \frac{1}{2\rho} - \frac14$. Then as above it is easy to check that the
minimum expected squared radius is at least of order $n^{-4\beta/(1+4\beta)}$ and that
it is impossible to adapt over Besov bodies with $\tau > 2\beta$. On the other hand,
suppose that $\beta < \frac{1}{2\rho} - \frac14$. Then by (15), the minimum expected squared
radius is at least of order $n^{\rho/2 - 1}$, which is the minimax rate of convergence
for the squared radius over a Besov body with $\beta = \frac1\rho - \frac12$. Hence in this
case it is impossible to adapt over any Besov body with smoothness index
$\tau > \frac1\rho - \frac12$.

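In summary, for a confidence ball with a prespecified coverage probability
over a Besov body $B^\beta_{p,q}(M)$ the maximum range of Besov bodies $B^\tau_{p,q}(M)$
over which full adaptation is possible is given in Table 1.

Table 1

$N$, $n$ and $\beta$                                                      Maximum range of adaptation

$N \ge n^2$, all $\beta > 0$                                              $\beta \le \tau \le 2\beta$
$N = n^\rho$ for $0 < \rho < 2$, $\beta \ge \frac{1}{2\rho} - \frac14$        $\beta \le \tau \le 2\beta$
$N = n^\rho$ for $0 < \rho < 2$, $0 < \beta < \frac{1}{2\rho} - \frac14$      $\beta \le \tau \le \frac1\rho - \frac12$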
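The case distinction summarized in Table 1 can be written as a small lookup. This is a sketch under the reading of the table given by the preceding two paragraphs (in particular, the upper limit $\frac1\rho - \frac12$ in the third case is taken from the text); the function name and the convention `rho=None` for $N \ge n^2$ are assumptions made here.

```python
def max_adaptation_range(beta, rho=None):
    """Return (lower, upper) limits for the smoothness index tau as in Table 1.

    rho=None stands for the case N >= n^2; otherwise N = n^rho with 0 < rho < 2.
    """
    if beta <= 0.0:
        raise ValueError("beta must be positive")
    if rho is None:
        return beta, 2.0 * beta
    if not 0.0 < rho < 2.0:
        raise ValueError("rho must lie in (0, 2)")
    if beta >= 1.0 / (2.0 * rho) - 0.25:
        return beta, 2.0 * beta
    return beta, 1.0 / rho - 0.5
```

At the boundary $\beta = \frac{1}{2\rho} - \frac14$ the two upper limits coincide, since $2\beta = \frac1\rho - \frac12$ there.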
3.2. Construction of adaptive confidence balls. In this section the focus
is on confidence balls which have a given minimal coverage over a particu- 
lar Besov body. Subject to this constraint, confidence balls are constructed 
which have expected squared radius adapting across a range of Besov bod- 
ies. The resulting balls are shown to be adaptive over the maximal range 
of Besov bodies given in Table 1 for the first two cases summarized in the 
table. The third case is covered in Section 4. 

The ball is centered on a local thresholding rule and the squared radius is
based on an analysis of the loss of this thresholding rule. More specifically,
for the center $\hat\theta$, let $J_1$ be the largest integer satisfying

(16)  $2^{J_1} \le \min(N,\ M^{2/(1+2\beta)} n^{1/(1+2\beta)}).$

For all $j \ge J_1$, set $\hat\theta_{j,k} = 0$ and for $j \le J_1 - 1$ let $\hat\theta_{j,k}$ be the local thresholding
estimator given in (6). The radius is found by analyzing the loss

(17)  $\|\hat\theta - \theta\|_2^2 = \sum_{j=0}^{J-1} \sum_{k=1}^{2^j} (\hat\theta_{j,k} - \theta_{j,k})^2 = \sum_{j=0}^{J_1-1} \sum_{k=1}^{2^j} (\hat\theta_{j,k} - \theta_{j,k})^2 + \sum_{j=J_1}^{J-1} \sum_{k=1}^{2^j} \theta^2_{j,k}.$

The first of these terms is handled similarly to that used in (7) and (8).
The second component in the loss, $\sum_{j=J_1}^{J-1} \sum_{k=1}^{2^j} \theta^2_{j,k}$, is a quadratic functional.
It can be estimated well by using an unbiased estimate of $\sum_{j=J_1}^{J_2-1} \sum_{k=1}^{2^j} \theta^2_{j,k}$,
where $J_2$ is the largest integer satisfying $2^{J_2} \le \min(N,\ M^{4/(1+4\beta)} n^{2/(1+4\beta)})$,
and then bounding the tail $\sum_{j=J_2}^{J-1} \sum_{k=1}^{2^j} \theta^2_{j,k}$ from above.
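The two cutoff levels can be computed directly from their definitions. The sketch below assumes the bounds $\min(N, M^{2/(1+2\beta)} n^{1/(1+2\beta)})$ for $J_1$ and $\min(N, M^{4/(1+4\beta)} n^{2/(1+4\beta)})$ for $J_2$ as read from the text; the function names and the example values are hypothetical.

```python
def largest_level(bound):
    """Largest integer J with 2**J <= bound (requires bound >= 1)."""
    if bound < 1:
        raise ValueError("bound must be at least 1")
    level = 0
    while 2 ** (level + 1) <= bound:
        level += 1
    return level

def cutoff_levels(N, n, M, beta):
    """Return (J1, J2): thresholding stops at J1, quadratic estimation at J2."""
    bound1 = min(N, M ** (2.0 / (1.0 + 2.0 * beta)) * n ** (1.0 / (1.0 + 2.0 * beta)))
    bound2 = min(N, M ** (4.0 / (1.0 + 4.0 * beta)) * n ** (2.0 / (1.0 + 4.0 * beta)))
    return largest_level(bound1), largest_level(bound2)

# Hypothetical example: N = 2^10 - 1, n = 1000, M = 2, beta = 1.
J1, J2 = cutoff_levels(N=1023, n=1000, M=2.0, beta=1.0)
```

Here the two bounds are roughly 15.9 and 27.6, so levels 0 to 2 are block thresholded, level 3 enters through the unbiased estimate of the quadratic functional, and the remaining levels are controlled by the tail bound.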
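More specifically, set the squared radius

(18)  $r_\alpha^2 = c_\alpha M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)} + \sum_{j=0}^{J_1-1} \sum_i (S^2_{j,i} - L n^{-1})_+ I(S^2_{j,i} \le \lambda_* L n^{-1}) + (2\lambda_* + 8\lambda_*^{1/2} - 1) L n^{-1} \sum_{j=0}^{J_1-1} \mathrm{Card}\{i : S^2_{j,i} > \lambda_* L n^{-1}\} + \sum_{j=J_1}^{J_2-1} \sum_{k=1}^{2^j} (y^2_{j,k} - n^{-1}),$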

where 



c Q = 2 2/3 (l-2- 



1 + 21og 1 / 2 (^ 

.a 



+ 



21og 1/2 (^) +z Q/4 -2 5 / 2 Ay 2 (l -2" 



2/3^1/(2+4/3) 



+ z Q/4 -2 /3+1 (l 



-2/3^-1/2 
$\times\ M^{1/(1+2\beta) - 2/(1+4\beta)}\, n^{1/(2+4\beta) - 1/(1+4\beta)}.$

Note that the last term in $c_\alpha$ tends to $0$ as $n \to \infty$ or $M \to \infty$.

The following theorem shows that the confidence ball $CB_*$ defined by

(19)  $CB_* = \{\theta : \|\theta - \hat\theta\|_2 \le r_\alpha\}$

has adaptive radius and desired coverage probability.

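Theorem 4. Fix $0 < \alpha < \frac12$ and let the confidence ball $CB_*$ be given as
in (19). Then, for any $\tau \ge \beta$,

(20)  $\inf_{\theta \in B^\tau_{p,q}(M)} P(\theta \in CB_*) \ge (1 - \alpha) - [n^{-1} + 3(1 - 2^{-2\beta})^{-1/(1+2\beta)}] L^{-1} M^{2/(1+2\beta)} n^{-2\beta/(1+2\beta)}.$

For $\tau \le 2\beta$,

(21)  $\sup_{\theta \in B^\tau_{p,q}(M)} E(r_\alpha^2) \le C_\tau \min(M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)},\ N n^{-1})$

and for $\tau > 2\beta$,

(22)  $\sup_{\theta \in B^\tau_{p,q}(M)} E(r_\alpha^2) \le C_\beta \min(M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)},\ N n^{-1}),$

where $C_\tau$ and $C_\beta$ are constants depending only on $\tau$ and $\beta$, respectively.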

Theorem 4 taken together with Theorem 3 shows that the confidence ball
$CB_*$ is adaptive over a maximal range of Besov bodies

(23)  $\mathcal{C} = \{B^\tau_{p,q}(M) : \tau \in [\beta, 2\beta],\ p \ge 2,\ q \ge 1\}$

when either $N \ge n^2$ or $N = n^\rho$, $0 < \rho < 2$ and $\beta \ge \frac{1}{2\rho} - \frac14$. In addition, the
results also show that the confidence ball $CB_*$ still has guaranteed coverage
over $B^\tau_{p,q}(M)$ for $\tau > 2\beta$ although the maximum expected radius is
necessarily inflated.
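The way the upper bounds of Theorem 4 saturate beyond $\tau = 2\beta$ can be seen numerically. The sketch below evaluates the order of the bounds in (21) and (22) with the unspecified constants $C_\tau$ and $C_\beta$ set to 1, which is an assumption made purely for illustration; the function name is hypothetical.

```python
def radius_upper_rate(tau, beta, M, n, N):
    """Order of the maximum expected squared radius over a Besov body of smoothness tau.

    Uses the rate in (21) for tau <= 2*beta and the rate in (22) for tau > 2*beta,
    with the multiplicative constants omitted.
    """
    if tau < beta:
        raise ValueError("coverage is only guaranteed for tau >= beta")
    if tau <= 2.0 * beta:
        rate = M ** (2.0 / (1.0 + 2.0 * tau)) * n ** (-2.0 * tau / (1.0 + 2.0 * tau))
    else:
        rate = M ** (2.0 / (1.0 + 4.0 * beta)) * n ** (-4.0 * beta / (1.0 + 4.0 * beta))
    return min(rate, N / n)
```

For fixed $\beta$ the rate improves as $\tau$ increases up to $2\beta$ and stays constant afterwards, which is the inflation relative to the minimax rate mentioned above.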

4. Adaptive confidence balls with coverage over $\mathbb{R}^N$. In Section 3 it
was assumed that the mean vector belongs to a Besov body $B^\beta_{p,q}(M)$ and
the confidence ball was constructed to ensure that it had a prespecified
coverage probability over that Besov body. Under this constraint there are
two situations where the confidence ball has expected squared radius that
adapts over the Besov bodies $B^\tau_{p,q}(M)$ with $\tau$ between $\beta$ and $2\beta$, namely,
when $N \ge n^2$ or when $N = n^\rho$ with $0 < \rho < 2$ and $\beta \ge \frac{1}{2\rho} - \frac14$. In both cases
this is the largest range over which adaptation is possible.
We now turn to a construction of "honest" confidence balls which have
guaranteed coverage over all of $\mathbb{R}^N$. For the case when $N = n$, such "honest"
confidence balls, those with a guaranteed coverage probability over all of $\mathbb{R}^N$,
were a topic pioneered in [11]. See also [2] and [3]. Li [11] was the first to
show, when $N = n$, that any "honest" confidence ball must have a minimum
expected squared radius of order $n^{-1/2}$. In fact, using the lower bounds in
Theorem 1 for the level-by-level case, it is easy to see that for any confidence
ball with coverage over all of $\mathbb{R}^N$ the random radius must in general
satisfy

(24)  $E_0(r_\alpha^2) \ge \frac{1 - 2\alpha - 2\varepsilon}{4} (\log(1 + \varepsilon^2))^{1/2} \cdot N^{1/2} n^{-1}.$

Once again, for the case when $N = n$, Li [11] also showed how to construct
"honest" confidence balls with maximum expected squared radius of order
$n^{-1/2}$ over a parameter space where a linear estimator can be constructed
with maximum risk of order $n^{-1/2}$. Such estimators exist when the parameter
space only consists of sufficiently smooth functions. In particular, for the
Besov bodies $B^\beta_{p,q}(M)$ with $p \ge 2$ Donoho and Johnstone [7] showed that
the minimax linear risk is of order $n^{-2\beta/(1+2\beta)}$ and the methodology of Li
[11] then leads to "honest" confidence balls with maximum expected squared
radius converging at a rate of $n^{-1/2}$ over Besov bodies $B^\beta_{p,q}(M)$ if $\beta \ge \frac12$ and
$p \ge 2$. However this approach is not adaptive over Besov bodies $B^\beta_{p,q}(M)$
with $\beta < \frac12$.

In this section "honest" confidence balls are constructed over $\mathbb{R}^N$ which
simultaneously adapt over a maximal range of Besov bodies. Attention is
focused on the case where $N < n^2$ since, from (24), if $N \ge n^2$, the minimum
expected squared radius of such "honest" confidence balls does not even
converge to zero.
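The scaling of the lower bound (24) in $N$ and $n$ can be checked with a few lines. The sketch assumes the constant in (24) reads $\frac14(1 - 2\alpha - 2\varepsilon)(\log(1 + \varepsilon^2))^{1/2}$; the function name is hypothetical.

```python
import math

def honest_lower_bound(alpha, eps, N, n):
    """Right-hand side of (24): lower bound on E_0(r^2) for honest confidence balls."""
    if not 0.0 < eps < 0.5 - alpha:
        raise ValueError("need 0 < eps < 1/2 - alpha")
    gamma = math.log(1.0 + eps ** 2)
    return 0.25 * (1.0 - 2.0 * alpha - 2.0 * eps) * math.sqrt(gamma) * math.sqrt(N) / n

# With N = n^2 the bound does not depend on n; with N = n it decays like n^{-1/2}.
flat_small = honest_lower_bound(0.05, 0.1, 100 ** 2, 100)
flat_large = honest_lower_bound(0.05, 0.1, 1000 ** 2, 1000)
decay_ratio = honest_lower_bound(0.05, 0.1, 100, 100) / honest_lower_bound(0.05, 0.1, 400, 400)
```

The first two values coincide, which is the reason attention is restricted to $N < n^2$, and the last ratio equals 2, reflecting the $n^{-1/2}$ order when $N = n$.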

The confidence ball is built by applying the single level construction given
in Section 2 level by level. In particular, the center of the confidence ball
is obtained by block thresholding all the observations in blocks of size $L =
\log n$. For each index $(j, k)$ in the block, say, $B^i_j$, the estimate of $\theta_{j,k}$ is given
by

(25)  $\hat\theta_{j,k} = y_{j,k} \cdot I(S^2_{j,i} > \lambda_* L n^{-1}),$

where $\lambda_* = 6.9368$. The center of the confidence ball $\hat\theta$ is then defined by
$\hat\theta = (\hat\theta_{j,k})$. The construction of the radius is once again based on an analysis
of the loss $\|\hat\theta - \theta\|_2^2$ and applies the same technique as that given in Section
2. Set



(26) 



J-i 



21ogV^j+4Ay 2 V 



#1/3 



71 



+ (2A* + 8Ai /2 - ljLn" 1 Card{i : S 2 ^ > A*Ln -1 }. 

With $\hat\theta$ given in (25) and $r_\alpha$ given in (26) the confidence ball is then defined
by

(27)  $CB_*(\hat\theta, r_\alpha) = \{\theta : \|\theta - \hat\theta\|_2 \le r_\alpha\}.$

Theorem 5. Let the confidence ball $CB_*(\hat\theta, r_\alpha)$ be given as in (27).
Then

(28)  $\inf_{\theta \in \mathbb{R}^N} P(\theta \in CB_*(\hat\theta, r_\alpha)) \ge 1 - \alpha - 2(\log n)^{-1}$

and, if $M \ge 1$,


sup E{r z c 
(29) OeB^JM) 



< 



a 



21og 1 / 2 f-) +4A* /2 V2 +4 



JVV2 



n 



where $C_\tau > 0$ is a constant depending only on $\tau$.

It is also interesting to understand Theorem 5 from an asymptotic point
of view. Fix $0 < \rho < 2$ and let $N = n^\rho$. It then follows from Theorem 5
that the confidence ball constructed above has adaptive squared radius over
Besov bodies $B^\tau_{p,q}(M)$ with $\tau \le \frac1\rho - \frac12$ and has maximum expected squared
radius of order $N^{1/2} n^{-1} = n^{\rho/2 - 1}$ over Besov bodies with $\tau \ge \frac1\rho - \frac12$. Note that the range



note that for $\tau \le \frac12$ and $M \ge 1$ it follows that

(30)  $\sup_{\theta \in B^\tau_{p,q}(M)} E(r_\alpha^2) \le C_\tau \min(1,\ M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)})$

and hence, although the confidence ball $CB_*$ depends only on $n$ and the
confidence level, it adapts over the collection of all Besov bodies $B^\beta_{p,q}(M)$
with $\beta \le \frac12$,

(31)  $\mathcal{C} = \{B^\beta_{p,q}(M) : 0 < \beta \le \frac12,\ p \ge 2,\ q \ge 1,\ M \ge 1\}.$

This is the maximal range of Besov bodies over which honest confidence
balls can adapt. In addition, it follows from (29) that the confidence ball has
maximum expected squared radius within a constant factor of the smallest
maximum expected squared radius for all Besov bodies $B^\beta_{p,q}(M)$ with $\beta > 0$
and $M \ge 1$ among all confidence balls which have a prespecified coverage
probability over $\mathbb{R}^N$.
5. Proofs. In this section proofs of the main theorems are given except
for Theorem 2. The proof of Theorem 2 is analogous although slightly easier 
than that given for Theorem 4. 

5.1. Proof of Theorems 1 and 3. Theorems 1 and 3 give lower bounds 
for the squared radius of the confidence balls. A unified proof of these two 
theorems can be given. We begin with a lemma on the minimax risk over a 
hypercube. 

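Lemma 1. Suppose $y_i = \theta_i + \sigma z_i$, $z_i \overset{\text{i.i.d.}}{\sim} N(0, 1)$ and $i = 1, \ldots, m$. Let
$a > 0$, and set $C_m(a) = \{\theta \in \mathbb{R}^m : \theta_i = \pm a,\ i = 1, \ldots, m\}$. Let the loss function
be

(32)  $L(\hat\theta, \theta) = \sum_{i=1}^m I(|\hat\theta_i - \theta_i| \ge a).$

Then the minimax risk over $C_m(a)$ satisfies

(33)  $\inf_{\hat\theta} \sup_{\theta \in C_m(a)} E(L(\hat\theta, \theta)) = \inf_{\hat\theta} \sup_{\theta \in C_m(a)} \sum_{i=1}^m P(|\hat\theta_i - \theta_i| \ge a) = \Phi\big(-\tfrac{a}{\sigma}\big) m,$

where $\Phi(\cdot)$ is the cumulative distribution function for the standard Normal
distribution.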

Proof. Let $\pi_i$, $i = 1, \ldots, m$, be independent with $\pi_i(a) = \pi_i(-a) = \frac12$.
Let $\pi = \prod_{i=1}^m \pi_i$ be the product prior on $\theta \in C_m(a)$. The posterior
distribution of $\theta$ given $y$ can be easily calculated as $P_{\theta|y}(\theta) = \prod_{i=1}^m P_{\theta_i|y_i}(\theta_i)$ where

$P_{\theta_i|y_i}(\theta_i) = \frac{e^{2ay_i/\sigma^2}}{1 + e^{2ay_i/\sigma^2}} \cdot I(\theta_i = a) + \frac{1}{1 + e^{2ay_i/\sigma^2}} \cdot I(\theta_i = -a).$

The Bayes estimator $\hat\theta^\pi$ under the prior $\pi$ and loss $L(\cdot, \cdot)$ given in (32) is
then the minimizer of $E_{\theta|y} L(\hat\theta, \theta) = \sum_{i=1}^m P_{\theta|y}(|\hat\theta_i - \theta_i| \ge a)$. A solution is
then given by the simple rule $\hat\theta^\pi_i = a$ if $y_i \ge 0$, $\hat\theta^\pi_i = -a$ if $y_i < 0$. The risk of
the Bayes rule $\hat\theta^\pi$ equals

(34)  $\sum_{i=1}^m \big[\tfrac12 P(y_i < 0 \mid \theta_i = a) + \tfrac12 P(y_i \ge 0 \mid \theta_i = -a)\big] = \Phi\big(-\tfrac{a}{\sigma}\big) m.$

Since the risk of the Bayes rule $\hat\theta^\pi$ is a constant, it equals the minimax risk.

□
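The closed form in Lemma 1 can be compared with a simulation of the sign rule used in the proof. This is a sketch only: the function names, the number of replications and the choice of the vertex $\theta = (a, \ldots, a)$ are assumptions made here; by symmetry the risk of the sign rule is the same at every vertex of $C_m(a)$.

```python
import random
from statistics import NormalDist

def hypercube_minimax_risk(a, sigma, m):
    """Closed form Phi(-a / sigma) * m from Lemma 1."""
    return NormalDist().cdf(-a / sigma) * m

def simulate_sign_rule_risk(a, sigma, m, reps, rng):
    """Monte Carlo risk of the rule 'estimate a if y >= 0, else -a' at theta = (a, ..., a)."""
    errors = 0
    for _ in range(reps):
        for _ in range(m):
            y = a + sigma * rng.gauss(0.0, 1.0)
            est = a if y >= 0.0 else -a
            # The loss counts coordinates with |est - theta_i| >= a, i.e. wrong signs.
            if abs(est - a) >= a:
                errors += 1
    return errors / reps

exact = hypercube_minimax_risk(1.0, 1.0, 4)
approx = simulate_sign_rule_risk(1.0, 1.0, 4, reps=20000, rng=random.Random(0))
```

The simulated value should be close to the exact one, up to Monte Carlo error.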

The proofs of Theorems 1 and 3 are also based on a bound on the $L_1$
distance between a multivariate normal distribution with mean $0$ and a mixture
of normal distributions with means supported on the union of vertices of a
collection of hyperrectangles. Let $C(a, k)$ be the set of $N$-dimensional
vectors of which the first $k$ coordinates are equal to $a$ or $-a$ and the remaining
coordinates are equal to $0$. Then $\mathrm{Card}(C(a, k)) = 2^k$. Let $P_k$ be the mixture
of Normal distributions with mean supported over $C(a, k)$,

(35)  $P_k = \frac{1}{2^k} \sum_{\theta \in C(a,k)} \Phi_{\theta, 1/\sqrt{n}, N},$

where $\Phi_{\theta,\sigma,N}$ is the Normal distribution $N(\theta, \sigma^2 I_N)$. Denote by $\phi_{\theta,\sigma,N}$ the
density of $\Phi_{\theta,\sigma,N}$ and set $P_0 = \Phi_{0, 1/\sqrt{n}, N}$.

Lemma 2. Fix $0 < \varepsilon < 1$ and suppose $k a^4 n^2 \le \log(1 + \varepsilon^2)$. Then

(36)  $L_1(P_0, P_k) \le \varepsilon.$

In particular, if $A$ is any event such that $P_0(A) \ge \alpha$, then

(37)  $P_k(A) \ge \alpha - \varepsilon,$

where $P_k$ is the mixture of Normal distributions given in (35).

Proof. The chi-squared distance between the distributions $P_k$ and $P_0 =
\Phi_{0, 1/\sqrt{n}, N}$ satisfies $\int \frac{(dP_k)^2}{dP_0} \le e^{k a^4 n^2} \le 1 + \varepsilon^2$ and consequently the $L_1$ distance
between $P_0$ and $P_k$ satisfies

$L_1(P_0, P_k) = \int |dP_0 - dP_k| \le \Big(\int \frac{(dP_k)^2}{dP_0} - 1\Big)^{1/2} \le \varepsilon.$

Hence, if $P_0(A) \ge \alpha$, then $P_k(A) \ge P_0(A) - L_1(P_0, P_k) \ge \alpha - \varepsilon$ and the lemma
follows. □
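The bound in Lemma 2 can be verified numerically for given $k$, $n$ and $\varepsilon$. The sketch relies on the standard identity $\int (dP_k)^2/dP_0 = \cosh(n a^2)^k$ for a product of symmetric two-point mixtures with noise variance $1/n$; this identity is not spelled out in the text and is used here as an assumption, and the function names are hypothetical.

```python
import math

def l1_bound(k, a, n):
    """Upper bound (cosh(n a^2)^k - 1)^{1/2} on the L1 distance between P_0 and P_k."""
    chi2_affinity = math.cosh(n * a * a) ** k
    return math.sqrt(chi2_affinity - 1.0)

def largest_amplitude(k, n, eps):
    """Largest a permitted by Lemma 2, solving k a^4 n^2 = log(1 + eps^2)."""
    return (math.log(1.0 + eps ** 2) / (k * n * n)) ** 0.25

eps, k, n = 0.1, 64, 100
a_max = largest_amplitude(k, n, eps)
bound = l1_bound(k, a_max, n)
```

Since $\cosh(x) \le e^{x^2/2}$, the computed bound is in fact smaller than $\varepsilon$, so the condition of the lemma leaves some slack.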

Proof of Theorems 1 and 3. We first prove the bound (3). Fix a 
constant e satisfying < e < ~ a ) an d note that z a+ 2 e > 0. Take m = 2 J , 
a = n~ 1 / 2 and a = min(z a _|_2 e n _1 ' /2 , M2 _J ^ +1 / 2 )) in Lemma 1 and let C m (a) 
be defined as in Lemma 1. Then every A r -dimensional vector with the jth 
level coordinates 9j in C m (a) and other coordinates equal to zero is contained 
in P>P q (M). It then follows from Lemma 1 that 

m m 

inf sup *Y[ P(\0j,k ~ Qj,k\ >a) > hif sup V P{\9 j>k - 9 j>k \ > a) 

> (a + 2e)m. 
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For any 6, set X e = Ek=i I{\d jjk - 6 jjk \ > a). Then X e < m. Let 7 = 
Then 

(a + 2e)m< sup E(Xq) < sup {7mP(Is < 7m) + mP(le > 7m)}. 

It follows that sup egfi /3 ^ P{Xq > 7m) > a + e and consequently 

(39) sup P(||^-%|||>7ma 2 )> sup > 7m) > a + e. 

eeBl q (M) eeB^ q (M) 

Suppose CB(9,r a ) = {6j : \\6j — 9j\\2 < r a } is a 1 — a level confidence ball 
over B^ q (M). Then inf 0gB/3 (A/) - 9\\l <r 2 )>l-a and hence 

sup P(r^> jma 2 ) > sup P(^ma 2 < \\6j — 6j\\\ < r 2 ) 

6eBl q (M) 8eB^ q (M) 

> a + e + 1 — a — l = e. 

Thus for any $\varepsilon$ satisfying $0 < \varepsilon < \frac12(\frac12 - \alpha)$, $\sup_{\theta \in B^\beta_{p,q}(M)} E(r_\alpha^2) \ge \varepsilon \gamma m a^2$,
which completes the proof of (3). The proof of (12) is quite similar. Let $j'$ be
the largest integer satisfying $2^{j'} \le \min(N, (1 - 2^{-q(\beta+1/2)})^{2/(q(1+2\beta))} z_{\alpha+2\varepsilon}^{-2/(1+2\beta)} \times
M^{2/(1+2\beta)} n^{1/(1+2\beta)})$. Equation (12) in Theorem 3 follows from Lemma 1 by
taking $m = 2^{j'}$, $\sigma = n^{-1/2}$ and $a = z_{\alpha+2\varepsilon} n^{-1/2}$.

We now turn to the proof of (4) and (15). For (4) apply Lemma 2 with
$k = 2^j$ and $a = \min(M 2^{-j(\beta+1/2)}, \gamma^{1/4} 2^{-j/4} n^{-1/2})$. It is easy to check by
using the first term in the minimum that $2^{js} 2^{j/p} a \le M$. Hence the sequence
which is equal to $a$ or $-a$ on the $j$th level and otherwise zero satisfies the
Besov constraint (2). Moreover, using the second term in the minimum, it is
clear that $k a^4 n^2 \le \gamma$. For (15) the above remarks hold with $j$ replaced by $J$
and it is clear that the collection $C(a, k)$ of all such sequences is contained
in $B^\beta_{p,q}(M)$. It then follows from Lemma 2 that, for $P_k$ defined by (35),
$L_1(P_0, P_k) \le \varepsilon$ and so

\[ P_k(0 \in CB(\delta, r_\alpha)) \ge 1 - \alpha - \varepsilon. \tag{40} \]

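Now since for all $\theta \in C(a, k)$, $P_\theta(\theta \in CB(\delta, r_\alpha)) \ge 1 - \alpha$ and hence $P_\theta(\{C(a, k) \cap
CB(\delta, r_\alpha) \ne \emptyset\}) \ge 1 - \alpha$, it follows that

\[ P_k(\{C(a, k) \cap CB(\delta, r_\alpha) \ne \emptyset\}) \ge 1 - \alpha. \tag{41} \]

The Bonferroni inequality applied to equations (40) and (41) then yields

\[ P_k(0 \in CB(\delta, r_\alpha) \cap \{C(a, k) \cap CB(\delta, r_\alpha) \ne \emptyset\}) \ge 1 - 2\alpha - \varepsilon. \tag{42} \]

Once again, since $L_1(P_0, P_k) \le \varepsilon$ it follows that

\[ P_0(0 \in CB(\delta, r_\alpha) \cap \{C(a, k) \cap CB(\delta, r_\alpha) \ne \emptyset\}) \ge 1 - 2\alpha - 2\varepsilon. \tag{43} \]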






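Now note that for all $\theta \in C(a, k)$, $\|\theta\|_2 = a k^{1/2} = 2 b_\varepsilon$. Hence, if $CB(\delta, r_\alpha)$
contains both $0$ and some point $\theta \in C(a, k)$, it follows that the radius $r_\alpha \ge
\frac12 \|\theta\|_2 = b_\varepsilon$ and consequently

\[ P_0(r_\alpha \ge b_\varepsilon) \ge P_0(0 \in CB(\delta, r_\alpha) \cap \{C(a, k) \cap CB(\delta, r_\alpha) \ne \emptyset\}) \ge 1 - 2\alpha - 2\varepsilon. \]
□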

5.2. Proof of Theorem 4. The proof of Theorem 4 is involved. We first
collect in the following lemmas some preparatory results on the tails of
chi-squared distributions and Besov bodies. The proofs of these lemmas are
straightforward and are thus omitted here. See [6] for detailed proofs.

Lemma 3. Let $X_m$ be a random variable having a central chi-squared
distribution with $m$ degrees of freedom. If $d > 0$, then

\[ P(X_m \ge (1 + d) m) \le \tfrac12 e^{-(m/2)(d - \log(1+d))} \tag{44} \]

and consequently $P(X_m \ge (1 + d) m) \le \tfrac12 e^{-(1/4) d^2 m + (1/6) d^3 m}$. If $0 < d < 1$,
then

\[ P(X_m \le (1 - d) m) \le e^{-(1/4) d^2 m}. \tag{45} \]
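A quick numerical sanity check of (44) and (45) is possible without any statistics library, because for even $m$ the chi-squared upper tail has the closed form $P(X_m \ge x) = e^{-x/2} \sum_{i < m/2} (x/2)^i / i!$. The sketch below evaluates both sides on a small grid; the grid values and function names are arbitrary choices made for illustration.

```python
import math

def chi2_upper_tail(m, x):
    """P(X_m >= x) for even m, using the closed-form Poisson sum."""
    if m % 2 != 0:
        raise ValueError("closed form used here requires even m")
    h = x / 2.0
    return math.exp(-h) * sum(h ** i / math.factorial(i) for i in range(m // 2))

def bound_44(m, d):
    """Right-hand side of (44)."""
    return 0.5 * math.exp(-(m / 2.0) * (d - math.log1p(d)))

def bound_45(m, d):
    """Right-hand side of (45)."""
    return math.exp(-0.25 * d * d * m)

for m in (2, 4, 10, 20):
    for d in (0.1, 0.5, 1.0, 2.0):
        print(m, d, chi2_upper_tail(m, (1 + d) * m), "<=", bound_44(m, d))
    for d in (0.1, 0.5, 0.9):
        print(m, d, 1.0 - chi2_upper_tail(m, (1 - d) * m), "<=", bound_45(m, d))
```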

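Lemma 4. Let $y_i = \theta_i + \sigma z_i$, $i = 1, 2, \ldots, L$, $z_i \overset{\text{i.i.d.}}{\sim} N(0, 1)$ and let $\lambda_* =
6.9368$ be the constant satisfying $\lambda - \log \lambda = 5$.

(i) For $\tau > 0$ let $\lambda_\tau > 1$ denote the constant satisfying $\lambda - \log \lambda = 1 +
\frac{4\tau}{1+2\tau}$. If $\sum_{i=1}^{L} \theta_i^2 \le (\sqrt{\lambda_*} - \sqrt{\lambda_\tau})^2 L \sigma^2$, then

\[ P\Bigl( \sum_{i=1}^{L} y_i^2 \ge \lambda_* L \sigma^2 \Bigr) \le P(\chi_L^2 \ge \lambda_\tau L) \le \tfrac12 e^{-(2\tau/(1+2\tau)) L}. \tag{46} \]

(ii) If $\sum_{i=1}^{L} \theta_i^2 \ge 4 \lambda_* L \sigma^2$, then

\[ P\Bigl( \sum_{i=1}^{L} y_i^2 \le \lambda_* L \sigma^2 \Bigr) \le P(\chi_L^2 \ge \lambda_* L) \le \tfrac12 e^{-2L}. \tag{47} \]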
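The constants in Lemma 4 are defined implicitly, so a short root-finding sketch may help. It solves $\lambda - \log\lambda = c$ on $(1, \infty)$ by bisection and recovers $\lambda_* \approx 6.9368$ for $c = 5$. The right-hand side $1 + 4\tau/(1+2\tau)$ used for $\lambda_\tau$ is the value consistent with the exponent in (46) when combined with (44); treat it as an assumption of this sketch.

```python
import math

def solve_lambda(c, lo=1.0, hi=50.0, iters=200):
    """Root of lam - log(lam) = c on (1, inf) by bisection; requires 1 < c < 46."""
    def f(lam):
        return lam - math.log(lam) - c
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:   # f is increasing on (1, inf)
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lambda_tau(tau):
    """lambda_tau with the assumed right-hand side 1 + 4 tau / (1 + 2 tau)."""
    return solve_lambda(1.0 + 4.0 * tau / (1.0 + 2.0 * tau))

lam_star = solve_lambda(5.0)
print("lambda_* =", lam_star)
# Exponent rate in (44) with 1 + d = lambda_*: (lambda_* - 1 - log lambda_*) / 2 = 2,
# which gives the factor e^{-2L} in (47).
print("rate     =", 0.5 * (lam_star - 1.0 - math.log(lam_star)))
print("lambda_tau(1.0) =", lambda_tau(1.0))
```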



Lemma 5. (i) For any $\theta \in B^\tau_{p,q}(M)$ and any $0 \le m \le J - 1$,

\[ \sum_{j=m}^{J-1} \sum_{k=1}^{2^j} \theta_{j,k}^2 \le (1 - 2^{-2\tau})^{-1} M^2 2^{-2\tau m}. \tag{48} \]

(ii) For a constant $a > 0$, set $\mathcal{I} = \{(j, i) : \sum_{(j,k) \in B_i^j} \theta_{j,k}^2 > a L n^{-1}\}$. Then
for $p \ge 2$

\[ \sup_{\theta \in B^\tau_{p,q}(M)} \mathrm{Card}(\mathcal{I}) \le D L^{-1} M^{2/(1+2\tau)} n^{1/(1+2\tau)}, \tag{49} \]

where $D$ is a constant depending only on $a$ and $\tau$. In particular, $D$ can be
taken as $D = 3 (1 - 2^{-2\tau})^{-1/(1+2\tau)} a^{-1/(1+2\tau)}$.

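Proof of Theorem 4. The proof is naturally divided into two parts:
expected squared radius and the coverage probability. First recall the
notation that for a given block $B_i^j$,

\[ S_{j,i}^2 = \sum_{(j,k) \in B_i^j} y_{j,k}^2, \qquad \xi_{j,i}^2 = \sum_{(j,k) \in B_i^j} \theta_{j,k}^2 \qquad \text{and} \qquad \chi_{j,i}^2 = \sum_{(j,k) \in B_i^j} z_{j,k}^2. \]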

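We begin with the expected squared radius. Let $\tau \ge \beta$ and suppose $\theta \in
B^\tau_{p,q}(M)$. From (18) we have

\[
\begin{aligned}
E_\theta(r_\alpha^2) ={}& c_\alpha M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)} \\
&+ \sum_{j=0}^{J_1-1} E_\theta \Bigl( \sum_i (S_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 \le \lambda_* L n^{-1}) \Bigr)_+ \\
&+ (2\lambda_* + 8\lambda_*^{1/2} - 1) L n^{-1} \sum_{j=0}^{J_1-1} \sum_i P_\theta(S_{j,i}^2 > \lambda_* L n^{-1}) \\
&+ \sum_{j=J_1}^{J_2-1} \sum_{k=1}^{2^j} E_\theta(y_{j,k}^2 - n^{-1}) \\
={}& G_1 + G_2 + G_3 + G_4.
\end{aligned}
\tag{50}
\]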
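We begin with the term $G_3$. Let $\lambda_\tau$ be defined as in Lemma 4 and set

\[ \mathcal{I}_1 = \{(j, i) : j \le J_1 - 1,\ \xi_{j,i}^2 > (\sqrt{\lambda_*} - \sqrt{\lambda_\tau})^2 L n^{-1}\} \tag{51} \]

and

\[ \mathcal{I}_2 = \{(j, i) : j \le J_1 - 1,\ \xi_{j,i}^2 \le (\sqrt{\lambda_*} - \sqrt{\lambda_\tau})^2 L n^{-1}\}. \tag{52} \]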

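It then follows from Lemmas 4 and 5 that

\[
\begin{aligned}
\sum_{j=0}^{J_1-1} \sum_i P_\theta(S_{j,i}^2 > \lambda_* L n^{-1})
&= \sum_{(j,i) \in \mathcal{I}_1} P_\theta(S_{j,i}^2 > \lambda_* L n^{-1}) + \sum_{(j,i) \in \mathcal{I}_2} P_\theta(S_{j,i}^2 > \lambda_* L n^{-1}) \\
&\le \mathrm{Card}(\mathcal{I}_1) + \tfrac12 L^{-1} 2^{J_1} n^{-2\tau/(1+2\tau)} \\
&\le \min(L^{-1} 2^{J_1}, D L^{-1} M^{2/(1+2\tau)} n^{1/(1+2\tau)}) + \tfrac12 L^{-1} 2^{J_1} n^{-2\tau/(1+2\tau)}
\end{aligned}
\]

for some constant $D$ depending only on $\tau$. Note that $2^{J_1} = \min(N, M^{2/(1+2\beta)} \times
n^{1/(1+2\beta)})$ and so

\[
\begin{aligned}
G_3 &= (2\lambda_* + 8\lambda_*^{1/2} - 1) L n^{-1} \sum_{j=0}^{J_1-1} \sum_i P_\theta(S_{j,i}^2 > \lambda_* L n^{-1}) \\
&\le C \min(N n^{-1}, M^{2/(1+2\beta)} n^{-2\beta/(1+2\beta)}, M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)}) \\
&\quad + C \min(N n^{-1}, M^{2/(1+2\beta)} n^{-2\beta/(1+2\beta)}) \cdot n^{-2\tau/(1+2\tau)} \\
&\le C \min(N n^{-1}, M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)}).
\end{aligned}
\tag{53}
\]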

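The term $G_4$ is easy to bound. When $N \le M^{2/(1+2\beta)} n^{1/(1+2\beta)}$, $J_1 = J_2$ and
hence $G_4 = 0$. When $N > M^{2/(1+2\beta)} n^{1/(1+2\beta)}$, it follows from (48) in Lemma
5 that

\[
\begin{aligned}
G_4 = \sum_{j=J_1}^{J_2-1} \sum_{k=1}^{2^j} \theta_{j,k}^2
&\le (1 - 2^{-2\tau})^{-1} M^2 2^{-2\tau J_1} \\
&\le C \min(N n^{-1}, M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)}).
\end{aligned}
\tag{54}
\]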

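We now turn to $G_2$. Let $J_\tau$ be the largest integer satisfying $2^{J_\tau} \le \min(N,
M^{2/(1+2\tau)} n^{1/(1+2\tau)})$. Write

\[
\begin{aligned}
G_2 &= \sum_{j=0}^{J_\tau-1} E_\theta \Bigl( \sum_i (S_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 \le \lambda_* L n^{-1}) \Bigr)_+
 + \sum_{j=J_\tau}^{J_1-1} E_\theta \Bigl( \sum_i (S_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 \le \lambda_* L n^{-1}) \Bigr)_+ \\
&= G_{21} + G_{22},
\end{aligned}
\]

where $G_{22} = 0$ when $J_\tau = J_1$. Note that

\[
\begin{aligned}
G_{21} &= \sum_{j=0}^{J_\tau-1} E_\theta \Bigl( \sum_i (S_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 \le \lambda_* L n^{-1}) \Bigr)_+ \\
&\le \sum_{j=0}^{J_\tau-1} \sum_i (\lambda_* - 1) L n^{-1} \\
&\le (\lambda_* - 1) L n^{-1} 2^{J_\tau} L^{-1} \\
&\le (\lambda_* - 1) \min(N n^{-1}, M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)}).
\end{aligned}
\tag{55}
\]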
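When $N \le M^{2/(1+2\tau)} n^{1/(1+2\tau)}$, $J_\tau = J_1$ and so $G_{22} = 0$. On the other hand,
when $J_\tau < J_1$,

\[
\begin{aligned}
G_{22} &= \sum_{j=J_\tau}^{J_1-1} E_\theta \Bigl( \sum_i (S_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 \le \lambda_* L n^{-1}) \Bigr)_+ \\
&\le \sum_{j=J_\tau}^{J_1-1} E_\theta \Bigl( \sum_i (S_{j,i}^2 - L n^{-1}) \Bigr)_+ \\
&\le \sum_{j=J_\tau}^{J_1-1} \Bigl\{ E_\theta \Bigl( \sum_i (S_{j,i}^2 - L n^{-1}) \Bigr)^2 \Bigr\}^{1/2} \\
&\le \sum_{j=J_\tau}^{J_1-1} \Bigl\{ 4 n^{-1} \sum_i \xi_{j,i}^2 + 2 n^{-2} 2^j + \Bigl( \sum_i \xi_{j,i}^2 \Bigr)^2 \Bigr\}^{1/2} \\
&\le 2 n^{-1/2} \sum_{j=J_\tau}^{J_1-1} \Bigl( \sum_i \xi_{j,i}^2 \Bigr)^{1/2} + 2^{1/2} n^{-1} \sum_{j=J_\tau}^{J_1-1} 2^{j/2} + \sum_{j=J_\tau}^{J_1-1} \sum_i \xi_{j,i}^2.
\end{aligned}
\]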

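Note that $\sum_i \xi_{j,i}^2 = \sum_{k=1}^{2^j} \theta_{j,k}^2 \le M^2 2^{-2\tau j}$. It then follows that

\[
\begin{aligned}
G_{22} \le{}& 2^{\tau+1} (1 - 2^{-\tau})^{-1} M^{1/(1+2\tau)} n^{-(1+4\tau)/(2+4\tau)} \\
&+ 4 M^{1/(1+2\beta)} n^{-(1+4\beta)/(2+4\beta)} \\
&+ 2^{2\tau} (1 - 2^{-2\tau})^{-1} M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)}
\end{aligned}
\]

and so

\[ G_{22} \le C \min(N n^{-1}, M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)}). \tag{56} \]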

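This together with (50) and (53)-(55) yields

\[
\begin{aligned}
\sup_{\theta \in B^\tau_{p,q}(M)} E_\theta(r_\alpha^2) &\le \sup_{\theta \in B^\tau_{p,q}(M)} (G_1 + G_{21} + G_{22} + G_3 + G_4) \\
&\le c_\alpha \min(N n^{-1}, M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)}) + C_\tau \min(N n^{-1}, M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)})
\end{aligned}
\]

where $C_\tau$ is a constant depending only on $\tau$. For $0 < \tau < \beta$ similar arguments
yield

\[ \sup_{\theta \in B^\tau_{p,q}(M)} E_\theta(r_\alpha^2) \le C_\tau \min(N n^{-1}, M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)}). \]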
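We now turn to the coverage probability. Set $C(\theta) = P(\|\hat\theta - \theta\|_2^2 > r_\alpha^2)$ and
fix $\tau \ge \beta$. We want to bound $\sup_{\theta \in B^\tau_{p,q}(M)} C(\theta)$. Note that

\[
\|\hat\theta - \theta\|_2^2 = \sum_{j=0}^{J_1-1} \sum_i \xi_{j,i}^2 I(S_{j,i}^2 \le \lambda_* L n^{-1})
 + n^{-1} \sum_{j=0}^{J_1-1} \sum_i \chi_{j,i}^2 I(S_{j,i}^2 > \lambda_* L n^{-1}) + \sum_{j=J_1}^{J-1} \sum_{k=1}^{2^j} \theta_{j,k}^2.
\]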
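The loss decomposition above reflects a keep-or-kill rule applied block by block: a block contributes $\xi_{j,i}^2$ when it is set to zero and $n^{-1}\chi_{j,i}^2$ when it is kept. A minimal sketch of such a rule at a single resolution level follows. It assumes block length $L$ close to $\log n$ (rounded to an integer here), noise variance $1/n$, and the threshold $\lambda_* L n^{-1}$; the function name and the handling of a short final block are illustrative choices, not the paper's definition.

```python
import math

LAMBDA_STAR = 6.9368  # constant from Lemma 4

def block_threshold_level(y, n, L=None):
    """Keep a block of noisy coefficients (noise variance 1/n) if and only if
    its energy S^2 exceeds lambda_* * (block size) / n; otherwise zero it."""
    if L is None:
        L = max(1, round(math.log(n)))  # block length ~ log n (assumption)
    estimate = []
    for start in range(0, len(y), L):
        block = y[start:start + L]
        s2 = sum(v * v for v in block)
        # Threshold uses the actual block size so a short last block is handled.
        keep = s2 > LAMBDA_STAR * len(block) / n
        estimate.extend(block if keep else [0.0] * len(block))
    return estimate

# One weak block (killed) followed by one strong block (kept), with n = 100.
y = [0.01] * 5 + [1.0] * 5
print(block_threshold_level(y, n=100))
```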

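It follows from (48) in Lemma 5 that

\[
\begin{aligned}
\sup_{\theta \in B^\tau_{p,q}(M)} \sum_{j=J_2}^{J-1} \sum_{k=1}^{2^j} \theta_{j,k}^2 &\le (1 - 2^{-2\tau})^{-1} M^2 2^{-2\tau J_2} \\
&\le 2^{2\beta} (1 - 2^{-2\beta})^{-1} M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)}.
\end{aligned}
\]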
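Set $a_0 = 2^{2\beta} (1 - 2^{-2\beta})^{-1}$, $a_1 = z_{\alpha/4} \cdot 2^{5/2} \lambda_*^{1/2} (1 - 2^{-2\beta})^{-1/(2+4\beta)} \times
M^{1/(1+2\beta) - 2/(1+4\beta)} n^{1/(2+4\beta) - 1/(1+4\beta)}$, $a_2 = 2 \log^{1/2}(\frac{4}{\alpha}) M^{1/(1+2\beta) - 2/(1+4\beta)} \times
n^{1/(2+4\beta) - 1/(1+4\beta)}$, $a_3 = z_{\alpha/4} \cdot 2^{\beta+1} (1 - 2^{-2\beta})^{-1/2} M^{1/(1+2\beta) - 2/(1+4\beta)} \times
n^{1/(2+4\beta) - 1/(1+4\beta)}$, $a_4 = 2 \log^{1/2}(\frac{4}{\alpha})$ and $a_5 = 2\lambda_* + 8\lambda_*^{1/2} - 1$. Then $c_\alpha$ in
(18) equals $a_0 + a_1 + a_2 + a_3 + a_4$ and the squared radius $r_\alpha^2$ given in (18)
can be written as

\[
\begin{aligned}
r_\alpha^2 ={}& (a_0 + a_1 + a_2 + a_3 + a_4) M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)} \\
&+ \sum_{j=0}^{J_1-1} \Bigl( \sum_i (S_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 \le \lambda_* L n^{-1}) \Bigr)_+ \\
&+ a_5 L n^{-1} \sum_{j=0}^{J_1-1} \sum_i I(S_{j,i}^2 > \lambda_* L n^{-1}) + \sum_{j=J_1}^{J_2-1} \sum_{k=1}^{2^j} (y_{j,k}^2 - n^{-1}).
\end{aligned}
\]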

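Set $\mathcal{I}_3 = \{(j, i) : j \le J_1 - 1,\ \xi_{j,i}^2 > 4\lambda_* L n^{-1}\}$ and $\mathcal{I}_4 = \{(j, i) : j \le J_1 - 1,\ \xi_{j,i}^2 \le
4\lambda_* L n^{-1}\}$. It then follows that

\[
\begin{aligned}
C(\theta) \le{}& P \Bigl( \sum_{(j,i) \in \mathcal{I}_3} [\xi_{j,i}^2 I(S_{j,i}^2 \le \lambda_* L n^{-1}) + n^{-1} \chi_{j,i}^2 I(S_{j,i}^2 > \lambda_* L n^{-1})] \\
&\qquad > \sum_{(j,i) \in \mathcal{I}_3} [(S_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 \le \lambda_* L n^{-1}) + a_5 L n^{-1} I(S_{j,i}^2 > \lambda_* L n^{-1})] \Bigr) \\
&+ P \Bigl( \sum_{(j,i) \in \mathcal{I}_4} [\xi_{j,i}^2 I(S_{j,i}^2 \le \lambda_* L n^{-1}) + n^{-1} \chi_{j,i}^2 I(S_{j,i}^2 > \lambda_* L n^{-1})] \\
&\qquad > (a_1 + a_2) M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)} \\
&\qquad\quad + \sum_{(j,i) \in \mathcal{I}_4} [(S_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 \le \lambda_* L n^{-1}) + a_5 L n^{-1} I(S_{j,i}^2 > \lambda_* L n^{-1})] \Bigr) \\
&+ P \Bigl( \sum_{j=J_1}^{J_2-1} \sum_{k=1}^{2^j} \theta_{j,k}^2 > (a_3 + a_4) M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)} + \sum_{j=J_1}^{J_2-1} \sum_{k=1}^{2^j} (y_{j,k}^2 - n^{-1}) \Bigr) \\
={}& T_1 + T_2 + T_3.
\end{aligned}
\]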

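We shall consider the three terms separately. We first calculate the term $T_1$.
Note that

\[
\begin{aligned}
T_1 \le{}& P \Bigl( \sum_{(j,i) \in \mathcal{I}_3} (S_{j,i}^2 - \xi_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 \le \lambda_* L n^{-1}) < 0 \Bigr) \\
&+ P \Bigl( \sum_{(j,i) \in \mathcal{I}_3} n^{-1} \chi_{j,i}^2 I(S_{j,i}^2 > \lambda_* L n^{-1}) > \sum_{(j,i) \in \mathcal{I}_3} a_5 L n^{-1} I(S_{j,i}^2 > \lambda_* L n^{-1}) \Bigr) \\
\le{}& \sum_{(j,i) \in \mathcal{I}_3} P(S_{j,i}^2 \le \lambda_* L n^{-1}) + \sum_{(j,i) \in \mathcal{I}_3} P(\chi_{j,i}^2 > a_5 L).
\end{aligned}
\]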
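It follows from Lemma 4(ii) that $P(S_{j,i}^2 \le \lambda_* L n^{-1}) \le P(\chi_{j,i}^2 \ge \lambda_* L) \le \frac12 n^{-2}$
for $(j, i) \in \mathcal{I}_3$. Lemma 5 now yields

\[
\begin{aligned}
T_1 &\le n^{-2} \cdot \mathrm{Card}(\mathcal{I}_3) \\
&\le 3 (1 - 2^{-2\tau})^{-1/(1+2\tau)} (4\lambda_*)^{-1/(1+2\tau)} L^{-1} M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)} \\
&\le 3 (1 - 2^{-2\beta})^{-1/(1+2\beta)} L^{-1} M^{2/(1+2\beta)} n^{-2\beta/(1+2\beta)}.
\end{aligned}
\tag{58}
\]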

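We now turn to the second term $T_2$. Note that

\[
\begin{aligned}
T_2 ={}& P \Bigl( \sum_{(j,i) \in \mathcal{I}_4} [(S_{j,i}^2 - \xi_{j,i}^2 - L n^{-1}) + a_5 L n^{-1} I(S_{j,i}^2 > \lambda_* L n^{-1})] \\
&\qquad < -(a_1 + a_2) M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)} \\
&\qquad\quad + \sum_{(j,i) \in \mathcal{I}_4} [(S_{j,i}^2 + n^{-1} \chi_{j,i}^2 - \xi_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 > \lambda_* L n^{-1})] \Bigr) \\
\le{}& P \Bigl( \sum_{(j,i) \in \mathcal{I}_4} (S_{j,i}^2 - \xi_{j,i}^2 - L n^{-1}) < -(a_1 + a_2) M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)} \Bigr) \\
&+ P \Bigl( \sum_{(j,i) \in \mathcal{I}_4} (S_{j,i}^2 + n^{-1} \chi_{j,i}^2 - \xi_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 > \lambda_* L n^{-1}) \\
&\qquad > \sum_{(j,i) \in \mathcal{I}_4} a_5 L n^{-1} I(S_{j,i}^2 > \lambda_* L n^{-1}) \Bigr) \\
={}& T_{21} + T_{22}.
\end{aligned}
\]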

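For any given block, write

\[
\begin{aligned}
S_{j,i}^2 &= \sum_{(j,k) \in B_i^j} (\theta_{j,k} + n^{-1/2} z_{j,k})^2 \\
&= \xi_{j,i}^2 + 2 n^{-1/2} \sum_{(j,k) \in B_i^j} \theta_{j,k} z_{j,k} + n^{-1} \chi_{j,i}^2 \\
&= \xi_{j,i}^2 + 2 n^{-1/2} \xi_{j,i} Z_{j,i} + n^{-1} \chi_{j,i}^2,
\end{aligned}
\]

where $Z_{j,i} = \xi_{j,i}^{-1} \sum_{(j,k) \in B_i^j} \theta_{j,k} z_{j,k}$ is a standard Normal variable. Then

\[
\begin{aligned}
T_{21} &= P \Bigl( \sum_{(j,i) \in \mathcal{I}_4} (S_{j,i}^2 - \xi_{j,i}^2 - L n^{-1}) < -(a_1 + a_2) M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)} \Bigr) \\
&\le P \Bigl( \sum_{(j,i) \in \mathcal{I}_4} (2 n^{-1/2} \xi_{j,i} Z_{j,i} + n^{-1} \chi_{j,i}^2 - L n^{-1}) < -(a_1 + a_2) M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)} \Bigr) \\
&\le P \Bigl( 2 n^{-1/2} \sum_{(j,i) \in \mathcal{I}_4} \xi_{j,i} Z_{j,i} < -a_1 M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)} \Bigr) \\
&\quad + P \Bigl( \sum_{(j,i) \in \mathcal{I}_4} \chi_{j,i}^2 < -a_2 M^{2/(1+4\beta)} n^{1/(1+4\beta)} + \mathrm{Card}(\mathcal{I}_4) L \Bigr) \\
&= T_{211} + T_{212}.
\end{aligned}
\]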
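The block decomposition $S_{j,i}^2 = \xi_{j,i}^2 + 2n^{-1/2}\xi_{j,i}Z_{j,i} + n^{-1}\chi_{j,i}^2$ used in this step is a purely algebraic identity, and it can be confirmed on simulated data. The sketch below draws one block with arbitrary illustrative values of $\theta$ and $n$ (none of them taken from the paper) and compares the two sides.

```python
import math
import random

def block_decomposition(theta, z, n):
    """Return (S^2, xi^2 + 2 n^{-1/2} xi Z + n^{-1} chi^2) for one block,
    where y_k = theta_k + z_k / sqrt(n). Requires theta != 0 so that Z is defined."""
    s2 = sum((t + zk / math.sqrt(n)) ** 2 for t, zk in zip(theta, z))
    xi2 = sum(t * t for t in theta)
    chi2 = sum(zk * zk for zk in z)
    xi = math.sqrt(xi2)
    big_z = sum(t * zk for t, zk in zip(theta, z)) / xi  # standard Normal by construction
    return s2, xi2 + 2.0 * xi * big_z / math.sqrt(n) + chi2 / n

random.seed(0)
theta = [0.3, -0.1, 0.25, 0.0, 0.4]           # illustrative block means
z = [random.gauss(0.0, 1.0) for _ in theta]   # i.i.d. N(0, 1) noise
lhs, rhs = block_decomposition(theta, z, n=200)
print(lhs, rhs)
```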
Note that, for any $0 \le j' \le J_1 - 1$,

\[ \sum_{(j,i) \in \mathcal{I}_4} \xi_{j,i}^2 \le 4\lambda_* 2^{j'} n^{-1} + M^2 (1 - 2^{-2\tau})^{-1} 2^{-2\tau j'}. \]

Minimizing the right-hand side yields that $\sum_{(j,i) \in \mathcal{I}_4} \xi_{j,i}^2 \le 2 (4\lambda_*)^{2\tau/(1+2\tau)} (1 -
2^{-2\tau})^{-1/(1+2\tau)} M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)} \le 8\lambda_* (1 - 2^{-2\beta})^{-1/(1+2\beta)} M^{2/(1+2\beta)} \times
n^{-2\beta/(1+2\beta)}$. Denote by $Z$ a standard Normal random variable. It then
follows that $T_{211} = P(Z < -\frac12 a_1 M^{2/(1+4\beta)} n^{1/(1+4\beta)} n^{-1/2} (\sum_{(j,i) \in \mathcal{I}_4} \xi_{j,i}^2)^{-1/2}) \le
P(Z < -z_{\alpha/4}) = \frac{\alpha}{4}$. Now consider the term $T_{212}$. If $\mathrm{Card}(\mathcal{I}_4) L \le a_2 M^{2/(1+4\beta)} n^{1/(1+4\beta)}$,
then $T_{212} = 0$. Now suppose $\mathrm{Card}(\mathcal{I}_4) L > a_2 M^{2/(1+4\beta)} n^{1/(1+4\beta)}$. It follows
from (45) in Lemma 3 by taking $m = \mathrm{Card}(\mathcal{I}_4) L \le 2^{J_1}$ and $d = a_2 M^{2/(1+4\beta)} \times
n^{1/(1+4\beta)} / m$ that $T_{212} \le \exp(-\frac14 a_2^2 M^{4/(1+4\beta) - 2/(1+2\beta)} n^{2/(1+4\beta) - 1/(1+2\beta)}) =
\frac{\alpha}{4}$ and hence

\[ T_{21} = T_{211} + T_{212} \le \frac{\alpha}{2}. \tag{59} \]

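We now consider the term $T_{22}$. Simple algebra yields that

\[
\begin{aligned}
T_{22} &= P \Bigl( \sum_{(j,i) \in \mathcal{I}_4} (S_{j,i}^2 + n^{-1} \chi_{j,i}^2 - \xi_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 > \lambda_* L n^{-1}) \\
&\qquad\quad > \sum_{(j,i) \in \mathcal{I}_4} a_5 L n^{-1} I(S_{j,i}^2 > \lambda_* L n^{-1}) \Bigr) \\
&\le \sum_{(j,i) \in \mathcal{I}_4} P(Z_{j,i} > \tfrac12 \xi_{j,i}^{-1} (a_5 - 2\lambda_* + 1) L n^{-1/2}) + \sum_{(j,i) \in \mathcal{I}_4} P(\chi_{j,i}^2 > \lambda_* L).
\end{aligned}
\]

Note that $\xi_{j,i}^2 \le 4\lambda_* L n^{-1}$ for $(j, i) \in \mathcal{I}_4$. Hence it follows from the bounds on
the tail probability of standard Normal and central chi-squared distributions
that

\[
\begin{aligned}
T_{22} &\le \sum_{(j,i) \in \mathcal{I}_4} P(Z_{j,i} > 2 (\log n)^{1/2}) + \sum_{(j,i) \in \mathcal{I}_4} \tfrac12 n^{-2} \\
&\le L^{-1} M^{2/(1+2\beta)} n^{-2\beta/(1+2\beta)} n^{-1}.
\end{aligned}
\tag{60}
\]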

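We now turn to the third term $T_3$. Note that $y_{j,k}^2 = \theta_{j,k}^2 + 2 n^{-1/2} \theta_{j,k} z_{j,k} +
n^{-1} z_{j,k}^2$ and so

\[
\begin{aligned}
T_3 ={}& P \Bigl( 2 n^{-1/2} \sum_{j=J_1}^{J_2-1} \sum_{k=1}^{2^j} \theta_{j,k} z_{j,k} + n^{-1} \sum_{j=J_1}^{J_2-1} \sum_{k=1}^{2^j} (z_{j,k}^2 - 1) < -(a_3 + a_4) M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)} \Bigr) \\
\le{}& P \Bigl( 2 n^{-1/2} \sum_{j=J_1}^{J_2-1} \sum_{k=1}^{2^j} \theta_{j,k} z_{j,k} < -a_3 M^{2/(1+4\beta)} n^{-4\beta/(1+4\beta)} \Bigr) \\
&+ P \Bigl( \sum_{j=J_1}^{J_2-1} \sum_{k=1}^{2^j} z_{j,k}^2 < (2^{J_2} - 2^{J_1}) - a_4 M^{2/(1+4\beta)} n^{1/(1+4\beta)} \Bigr) \\
={}& T_{31} + T_{32}.
\end{aligned}
\]

Set $\gamma^2 = \sum_{j=J_1}^{J_2-1} \sum_{k=1}^{2^j} \theta_{j,k}^2$ and $Z = \gamma^{-1} \sum_{j=J_1}^{J_2-1} \sum_{k=1}^{2^j} \theta_{j,k} z_{j,k}$. Then $Z$ is a
standard Normal variable and it follows from (48) in Lemma 5 that $\gamma^2 \le
2^{2\beta} (1 - 2^{-2\beta})^{-1} M^{2/(1+2\beta)} n^{-2\beta/(1+2\beta)}$. Hence

\[
\begin{aligned}
T_{31} &\le P(Z < -2^{-\beta-1} (1 - 2^{-2\beta})^{1/2} a_3 M^{2/(1+4\beta) - 1/(1+2\beta)} n^{1/(1+4\beta) - 1/(2+4\beta)}) \\
&= P(Z < -z_{\alpha/4}) = \frac{\alpha}{4}.
\end{aligned}
\tag{61}
\]

It follows from Lemma 3 with $m = 2^{J_2} - 2^{J_1}$ and $d = a_4 M^{2/(1+4\beta)} n^{1/(1+4\beta)} / m$
that $T_{32} \le e^{-(1/4) a_4^2} = \frac{\alpha}{4}$. Equation (20) now follows from this together with
(58), (59), (60) and (61). □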
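The constants $a_2$ and $a_4$ are calibrated so that the chi-squared pieces $T_{212}$ and $T_{32}$ are each bounded by $\alpha/4$, independently of $M$, $n$ and $\beta$. The sketch below evaluates the exponential bounds numerically, assuming $a_2 = 2\log^{1/2}(4/\alpha)\, M^{1/(1+2\beta)-2/(1+4\beta)} n^{1/(2+4\beta)-1/(1+4\beta)}$ and $a_4 = 2\log^{1/2}(4/\alpha)$; the input values are arbitrary and the function names are illustrative.

```python
import math

def a2(alpha, M, n, beta):
    """Constant a_2 under the form assumed in the lead-in."""
    return (2.0 * math.sqrt(math.log(4.0 / alpha))
            * M ** (1.0 / (1 + 2 * beta) - 2.0 / (1 + 4 * beta))
            * n ** (1.0 / (2 + 4 * beta) - 1.0 / (1 + 4 * beta)))

def a4(alpha):
    """Constant a_4 under the form assumed in the lead-in."""
    return 2.0 * math.sqrt(math.log(4.0 / alpha))

def t212_bound(alpha, M, n, beta):
    """exp(-(1/4) a_2^2 M^{4/(1+4b) - 2/(1+2b)} n^{2/(1+4b) - 1/(1+2b)})."""
    expo = (a2(alpha, M, n, beta) ** 2
            * M ** (4.0 / (1 + 4 * beta) - 2.0 / (1 + 2 * beta))
            * n ** (2.0 / (1 + 4 * beta) - 1.0 / (1 + 2 * beta)))
    return math.exp(-0.25 * expo)

def t32_bound(alpha):
    """exp(-(1/4) a_4^2)."""
    return math.exp(-0.25 * a4(alpha) ** 2)

alpha = 0.05
print(t212_bound(alpha, M=3.0, n=1000, beta=0.7), t32_bound(alpha), alpha / 4)
```

Both bounds print as $\alpha/4$ up to rounding; the powers of $M$ and $n$ cancel exactly, which is why the calibration does not depend on them.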

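5.3. Proof of Theorem 5. The proof of Theorem 5 is similar to that of
Theorem 4. We shall omit some details and only give a brief proof here.
Suppose $\theta \in B^\tau_{p,q}(M)$. Set $b_1 = 2 \log^{1/2}(\frac{2}{\alpha})$, $b_2 = 4 \lambda_*^{1/2} z_{\alpha/2}$ and $b_3 = 2\lambda_* +
8\lambda_*^{1/2} - 1$. Then, from (26) we have

\[
\begin{aligned}
E_\theta(r_\alpha^2) ={}& (b_1 + b_2) N^{1/2} n^{-1} \\
&+ \sum_{j=0}^{J-1} E_\theta \Bigl( \sum_i (S_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 \le \lambda_* L n^{-1}) \Bigr)_+ \\
&+ b_3 L n^{-1} E_\theta(\mathrm{Card}\{(j, i) : S_{j,i}^2 > \lambda_* L n^{-1}\}).
\end{aligned}
\tag{62}
\]

The last term can be easily bounded using Lemma 5 as

\[ b_3 L n^{-1} E_\theta(\mathrm{Card}\{(j, i) : S_{j,i}^2 > \lambda_* L n^{-1}\}) \le b_3 \cdot \min(N n^{-1}, M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)}). \]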
Set $D = \sum_{j=0}^{J-1} E_\theta (\sum_i (S_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 \le \lambda_* L n^{-1}))_+$. Using nearly identical
arguments given in the derivation of (55) and (56) in the proof of Theorem 4,
$D$ is bounded as $D \le 4 N^{1/2} n^{-1} + C_\tau M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)}$ for some constant
$C_\tau > \lambda_*$. On the other hand, it is easy to see that $D \le \sum_{j=0}^{J-1} \sum_i (\lambda_* L n^{-1} -
L n^{-1}) = (\lambda_* - 1) N n^{-1}$ and consequently $\sup_{\theta \in B^\tau_{p,q}(M)} E(r_\alpha^2) \le (b_1 + b_2 +
4) N^{1/2} n^{-1} + C_\tau \min(N n^{-1}, M^{2/(1+2\tau)} n^{-2\tau/(1+2\tau)})$.

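We now turn to the coverage probability. Again, set $C(\theta) = P(\|\hat\theta - \theta\|_2^2 >
r_\alpha^2)$. We want to show that $\sup_{\theta \in \mathbb{R}^N} C(\theta) \le \alpha + 4 (\log n)^{-1}$. Note that

\[
\|\hat\theta - \theta\|_2^2 = \sum_{j=0}^{J-1} \sum_i \xi_{j,i}^2 I(S_{j,i}^2 \le \lambda_* L n^{-1})
 + n^{-1} \sum_{j=0}^{J-1} \sum_i \chi_{j,i}^2 I(S_{j,i}^2 > \lambda_* L n^{-1}).
\]

Set $\mathcal{I}_3' = \{(j, i) : \xi_{j,i}^2 > 4\lambda_* L n^{-1}\}$ and $\mathcal{I}_4' = \{(j, i) : \xi_{j,i}^2 \le 4\lambda_* L n^{-1}\}$. It then
follows from the definition of the radius $r_\alpha$ given in (26) that

\[
\begin{aligned}
C(\theta) \le{}& P \Bigl( \sum_{(j,i) \in \mathcal{I}_3'} [\xi_{j,i}^2 I(S_{j,i}^2 \le \lambda_* L n^{-1}) + n^{-1} \chi_{j,i}^2 I(S_{j,i}^2 > \lambda_* L n^{-1})] \\
&\qquad > \sum_{(j,i) \in \mathcal{I}_3'} [(S_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 \le \lambda_* L n^{-1}) + b_3 L n^{-1} I(S_{j,i}^2 > \lambda_* L n^{-1})] \Bigr) \\
&+ P \Bigl( \sum_{(j,i) \in \mathcal{I}_4'} [\xi_{j,i}^2 I(S_{j,i}^2 \le \lambda_* L n^{-1}) + n^{-1} \chi_{j,i}^2 I(S_{j,i}^2 > \lambda_* L n^{-1})] \\
&\qquad > (b_1 + b_2) N^{1/2} n^{-1} + \sum_{(j,i) \in \mathcal{I}_4'} [(S_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 \le \lambda_* L n^{-1}) \\
&\qquad\quad + b_3 L n^{-1} I(S_{j,i}^2 > \lambda_* L n^{-1})] \Bigr) \\
={}& T_1 + T_2.
\end{aligned}
\]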

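We first bound the term $T_1$. Similarly as in the proof of Theorem 4,

\[ T_1 \le \sum_{(j,i) \in \mathcal{I}_3'} (P(S_{j,i}^2 \le \lambda_* L n^{-1}) + P(\chi_{j,i}^2 > b_3 L)) \le n^{-2} \mathrm{Card}(\mathcal{I}_3') \le L^{-1}. \tag{63} \]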
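On the other hand, note that

\[
\begin{aligned}
T_2 ={}& P \Bigl( \sum_{(j,i) \in \mathcal{I}_4'} [(S_{j,i}^2 - \xi_{j,i}^2 - L n^{-1}) + b_3 L n^{-1} I(S_{j,i}^2 > \lambda_* L n^{-1})] \\
&\qquad < -(b_1 + b_2) N^{1/2} n^{-1} + \sum_{(j,i) \in \mathcal{I}_4'} [(S_{j,i}^2 + n^{-1} \chi_{j,i}^2 - \xi_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 > \lambda_* L n^{-1})] \Bigr) \\
\le{}& P \Bigl( \sum_{(j,i) \in \mathcal{I}_4'} (S_{j,i}^2 - \xi_{j,i}^2 - L n^{-1}) < -(b_1 + b_2) N^{1/2} n^{-1} \Bigr) \\
&+ P \Bigl( \sum_{(j,i) \in \mathcal{I}_4'} (S_{j,i}^2 + n^{-1} \chi_{j,i}^2 - \xi_{j,i}^2 - L n^{-1}) I(S_{j,i}^2 > \lambda_* L n^{-1}) \\
&\qquad > \sum_{(j,i) \in \mathcal{I}_4'} b_3 L n^{-1} I(S_{j,i}^2 > \lambda_* L n^{-1}) \Bigr) \\
={}& T_{21} + T_{22}.
\end{aligned}
\]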

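Set $Z_{j,i} = \xi_{j,i}^{-1} \sum_{(j,k) \in B_i^j} \theta_{j,k} z_{j,k}$. Then $Z_{j,i}$ is a standard Normal random
variable and

\[
\begin{aligned}
T_{21} &= P \Bigl( \sum_{(j,i) \in \mathcal{I}_4'} (2 n^{-1/2} \xi_{j,i} Z_{j,i} + n^{-1} \chi_{j,i}^2 - L n^{-1}) < -(b_1 + b_2) N^{1/2} n^{-1} \Bigr) \\
&\le P \Bigl( \sum_{(j,i) \in \mathcal{I}_4'} \chi_{j,i}^2 < -b_1 N^{1/2} + \mathrm{Card}(\mathcal{I}_4') L \Bigr)
 + P \Bigl( \sum_{(j,i) \in \mathcal{I}_4'} \xi_{j,i} Z_{j,i} < -\tfrac12 b_2 N^{1/2} n^{-1/2} \Bigr).
\end{aligned}
\]

If $\mathrm{Card}(\mathcal{I}_4') L \le b_1 N^{1/2}$, then $P(\sum_{(j,i) \in \mathcal{I}_4'} \chi_{j,i}^2 < -b_1 N^{1/2} + \mathrm{Card}(\mathcal{I}_4') L) = 0$.
When $\mathrm{Card}(\mathcal{I}_4') L > b_1 N^{1/2}$, equation (45) with $m = \mathrm{Card}(\mathcal{I}_4') L \le N$ and
$d = b_1 N^{1/2} / m$ yields that

\[ P \Bigl( \sum_{(j,i) \in \mathcal{I}_4'} \chi_{j,i}^2 < -b_1 N^{1/2} + \mathrm{Card}(\mathcal{I}_4') L \Bigr) \le e^{-(1/4) d^2 m} \le e^{-(1/4) b_1^2} = \frac{\alpha}{2}. \]

On the other hand, note that $\sum_{(j,i) \in \mathcal{I}_4'} \xi_{j,i}^2 \le N L^{-1} \cdot 4\lambda_* L n^{-1} = 4\lambda_* N n^{-1}$
and hence $P(\sum_{(j,i) \in \mathcal{I}_4'} \xi_{j,i} Z_{j,i} < -\frac12 b_2 N^{1/2} n^{-1/2}) \le P(Z < -\frac14 b_2 \lambda_*^{-1/2}) = \frac{\alpha}{2}$
where $Z \sim N(0, 1)$.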
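We now turn to the term $T_{22}$. Note that $\xi_{j,i}^2 \le 4\lambda_* L n^{-1}$ for $(j, i) \in \mathcal{I}_4'$.
Hence

\[
\begin{aligned}
T_{22} &\le \sum_{(j,i) \in \mathcal{I}_4'} P(2 n^{-1/2} \xi_{j,i} Z_{j,i} + 2 n^{-1} \chi_{j,i}^2 > (b_3 + 1) L n^{-1}) \\
&\le \sum_{(j,i) \in \mathcal{I}_4'} P(Z_{j,i} > \tfrac12 \xi_{j,i}^{-1} (b_3 - 2\lambda_* + 1) L n^{-1/2}) + \sum_{(j,i) \in \mathcal{I}_4'} P(\chi_{j,i}^2 > \lambda_* L) \\
&\le \sum_{(j,i) \in \mathcal{I}_4'} P(Z_{j,i} > 2 (\log n)^{1/2}) + \sum_{(j,i) \in \mathcal{I}_4'} \tfrac12 n^{-2} \le L^{-1} N n^{-2} \le L^{-1}.
\end{aligned}
\]

Hence, $C(\theta) \le T_1 + T_{21} + T_{22} \le \alpha + 2 L^{-1} = \alpha + 2 (\log n)^{-1}$.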
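The calibration of $b_1$ and $b_2$ in this proof can be confirmed numerically: with $b_1 = 2\log^{1/2}(2/\alpha)$ the chi-squared piece is bounded by $e^{-b_1^2/4} = \alpha/2$, and with $b_2 = 4\lambda_*^{1/2} z_{\alpha/2}$ the Normal piece equals $P(Z < -z_{\alpha/2}) = \alpha/2$. The sketch below uses `statistics.NormalDist` from the Python standard library and reads $z_{\alpha/2}$ as the upper $\alpha/2$ quantile, which is an assumption about the paper's notation.

```python
import math
from statistics import NormalDist

LAMBDA_STAR = 6.9368  # constant from Lemma 4

def b1(alpha):
    """b_1 = 2 log^{1/2}(2 / alpha)."""
    return 2.0 * math.sqrt(math.log(2.0 / alpha))

def b2(alpha):
    """b_2 = 4 lambda_*^{1/2} z_{alpha/2}, with z the upper alpha/2 Normal quantile."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return 4.0 * math.sqrt(LAMBDA_STAR) * z

def t21_pieces(alpha):
    """The two bounds whose sum controls T_21 in the proof of Theorem 5."""
    chi_piece = math.exp(-0.25 * b1(alpha) ** 2)
    normal_piece = NormalDist().cdf(-0.25 * b2(alpha) / math.sqrt(LAMBDA_STAR))
    return chi_piece, normal_piece

alpha = 0.05
chi_piece, normal_piece = t21_pieces(alpha)
print(chi_piece, normal_piece, "sum =", chi_piece + normal_piece)
```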